How to Obtain Reliable Labels for MBTI Classification from Texts?

Sanja Štajner, Symanto Research, Nuremberg, Germany, sanja.stajner@symanto.com
Seren Yenikent, Symanto Research, Nuremberg, Germany, seren.yenikent@symanto.com

Proceedings of Recent Advances in Natural Language Processing, pages 1360–1368, Sep 1–3, 2021.
https://doi.org/10.26615/978-954-452-072-4_152

Abstract

Automatic detection of the Myers-Briggs Type Indicator (MBTI) from short posts has attracted noticeable attention in the last few years. Recent studies showed that this is quite a difficult task, especially on commonly used Twitter data. Obtaining MBTI labels is also difficult, as human annotation requires trained psychologists, and the automatic way of obtaining them is through long questionnaires of questionable usability for the task. In this paper, we present a method for collecting reliable MBTI labels via only four carefully selected questions that can be applied to any type of textual data.

1   Introduction

The Myers-Briggs Type Indicator (MBTI) model (Briggs-Myers and Myers, 1995) is one of the most widely used non-clinical psychometric models (Štajner and Yenikent, 2020). It classifies people into two groups along each of four dimensions: extraversion/introversion (E/I), sensing/intuition (S/N), thinking/feeling (T/F), and judgement/perception (J/P). This leads to a total of 16 personality types. The first three dimensions are based on the theoretical work of Carl Jung (1921), while the fourth dimension was added later by Myers and Briggs-Myers (1995). The MBTI personality framework has already been used for decades in educational and industry settings, e.g. for finding jobs that best resonate with the person’s preferences for information processing (S/N and T/F dimensions), finding work organization types that best resonate with the person’s preferred judgement processes (J/P dimension) thus leading to better job satisfaction, and for better matching work environments with the person’s preferences (E/I dimension) to lower employee turnover (Briggs-Myers and Myers, 1995).

The original MBTI questionnaire contains 93 questions and is not freely available.¹ Due to the popularity of the MBTI framework (it is estimated that more than 2 million US adults complete the inventory every year),² there are a number of freely available alternative MBTI questionnaires on the internet, with the 16personalities test³ being one of the most popular ones. According to the Myers-Briggs Foundation⁴ and the 16personalities test website,⁵ both questionnaires satisfy the accepted standards for test validity and reliability. Nevertheless, the MBTI questionnaires have received noticeable criticism from the academic community (Pittenger, 1993; Boyle, 1995) for not relying on a scientifically proven (i.e. data-driven) background, but rather on qualitative measures such as observation and introspection. The other common criticism is the binary nature of the questionnaire, as it is known that the majority of people usually lie somewhere in the middle of the scales (Pittenger, 1993).

The questionnaire-based personality detection has several weaknesses: it requires trained human assessors; it is prone to social desirability bias (Krumpal, 2011) and reference-group effect (Heine et al., 2002); it is questionable if answering questionnaires is a natural way of showing one’s personality (as opposed to free writing or behaviour “when nobody watches”). To detect MBTI typologies in a more natural way and without the need for trained human assessors, many studies in the last several years have attempted to build systems for automatic detection of MBTI personality types from text. Attempts have been made for automatic detection of MBTI personality types from: tweets written in English (Plank and Hovy, 2015), six other Western European languages (Verhoeven et al., 2016), and Japanese (Yamada et al., 2019); English posts collected from Personality Cafe forum⁶ available in Kaggle;⁷ and English Reddit comments (Gjurković and Šnajder, 2018; Gjurković et al., 2020). Despite being trained on large amounts of textual data (over one million), and modelled as four binary classification tasks, the best systems performed only slightly better than the random and majority-class baselines, regardless of the architecture used.

Some studies suggested that tweets might not contain sufficient amounts of MBTI signals (even after concatenating up to 150–200 tweets per user) due to the nature of Twitter posts (Celli and Lepri, 2018; Štajner and Yenikent, 2020, 2021). Another issue with all those studies and obtained results might be that the systems are supervised and were trained with gold labels obtained via MBTI questionnaires that suffer from all earlier mentioned weaknesses. In our recent study (Štajner and Yenikent, 2021), we found a low association between the MBTI types obtained via questionnaires and the MBTI signals found in the short texts written by participants (tweets and free texts on carefully chosen topics). At the same time, the inter-annotator agreement of two expert annotators assigning MBTI types based on those free texts was quite high (Štajner and Yenikent, 2021).

Contributions. To avoid all previously mentioned problems in automatic MBTI detection from texts, in this study, we propose a carefully designed set of four questions with answers on a 1–5 scale (Section 3) that aim to capture the main MBTI characteristics without taking much time from participants, and can be administered together with any open-end questions without need for trained human assessors. The validity of our questionnaire has been assessed via expert human annotation following previously proposed annotation methodology (Štajner and Yenikent, 2021). The agreement between the answers to the newly proposed questions and the expert human annotations was found to be similar to that between two trained annotators (Section 5.2). Another advantage of the proposed method is that it goes beyond binary typology, by offering a 5-point scale for each MBTI dimension. This creates a possibility for filtering out those instances written by people who exhibit a similar amount of signals from both polarities. As it is known that many people have characteristics of both polarities across MBTI dimensions (Pittenger, 1993), such filtering of training datasets might lead to better performances of automatic systems for MBTI detection from texts by removing noise.

¹ https://www.myersbriggs.org/using-type-as-a-professional/versions-of-the-mbti-questionnaire/
² https://www.verywellmind.com/the-myers-briggs-type-indicator-2795583#the-mbti-today
³ https://www.16personalities.com/free-personality-test
⁴ myersbriggs.org
⁵ https://www.16personalities.com/articles/reliability-and-validity
⁶ https://www.personalitycafe.com/
⁷ https://www.kaggle.com/datasnaek/mbti-type

2    Related Work

Plank and Hovy (2015) were the first to explore the use of Twitter data for obtaining a large-scale dataset for open-vocabulary automatic detection of MBTI personality traits. They collected a corpus of 1.2M English tweets automatically labelled for gender and MBTI type. To identify the users for whom an MBTI type can be automatically assigned, the authors relied on mentions of any of the 16 MBTI types plus the word “Briggs”. Additionally, each user was labelled as female or male whenever it was discernible; those users for whom the gender was not discernible were excluded from the study. For each selected Twitter user, the authors collected up to 2000 most recent tweets (to be included, each user had to have at least 100 tweets). Plank and Hovy (2015) found that the distribution of MBTI types across the selected Twitter users significantly differs from the distribution of MBTI types across the general US population. The authors further trained binary classification models (for each MBTI dimension separately) using various features and model architectures. The best systems outperformed majority-class baselines only for the E/I and T/F dimensions.

Verhoeven et al. (2016) used a similar strategy for obtaining large-scale MBTI datasets for six other languages: German, Italian, Dutch, French, Portuguese, and Spanish. As opposed to the work of Plank and Hovy (2015), the triggers for identifying users whose MBTI types can be automatically assigned were mentions of one of the 16 personality types and the word “personality” or pronouns and verb forms such as “I am” or “I have”, for each of the six languages. All retrieved contexts were manually checked for whether or not they describe the personality of the writer of the post. For all users whose posts passed this check, the gender was annotated based on the user’s name, handle, description, and profile picture (Verhoeven et al., 2016). Distributions of MBTI types across Twitter users of the six languages were found to be similar, with only a few exceptions (Verhoeven et al., 2016). The authors also trained binary classifiers using the dataset with 200 concatenated tweets for each user
and a LinearSVC classifier with binary word and character n-gram features. As for English (Plank and Hovy, 2015), in most of the languages the best classifiers outperformed the majority-class baselines only for the E/I and T/F dimensions.
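
The setup described above (four binary tasks, binary n-gram features, comparison against a majority-class baseline) can be illustrated with a minimal sketch. This is not code from the cited studies: the function names and the n-gram orders are assumptions made for the example, and the classifier itself (for instance a linear SVM) is left out, so the sketch only shows how the features and the baseline could be computed.

```python
from collections import Counter

# One binary task per MBTI dimension, in the order of the type letters.
DIMENSIONS = ("E/I", "S/N", "T/F", "J/P")


def binary_ngram_features(text, word_n=2, char_n=3):
    """Return the set of word and character n-grams present in `text`.

    Binary features record presence only, so a set is sufficient.
    The n-gram orders are illustrative assumptions, not values
    reported in the cited studies.
    """
    lowered = text.lower()
    tokens = lowered.split()
    features = set()
    # Word n-grams of order 1..word_n.
    for n in range(1, word_n + 1):
        for i in range(len(tokens) - n + 1):
            features.add(("w", " ".join(tokens[i:i + n])))
    # Character n-grams of order char_n (spaces included).
    for i in range(len(lowered) - char_n + 1):
        features.add(("c", lowered[i:i + char_n]))
    return features


def split_into_binary_tasks(mbti_type):
    """Map a four-letter type such as 'INTP' to one label per dimension."""
    if len(mbti_type) != 4:
        raise ValueError("expected a four-letter MBTI type")
    return dict(zip(DIMENSIONS, mbti_type.upper()))


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)
```

For a hypothetical sample of users typed INTP, INFJ, ENFP and INTJ, the E/I labels are I, I, E, I, so a classifier for that dimension would have to exceed an accuracy of 0.75 to beat the majority-class baseline.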

Gjurković and Šnajder (2018) compiled a large-scale MBTI dataset from English Reddit comments by relying on flairs (short introductions of users on various subreddits), which, in the case of the MBTI-related subreddits, usually contain the users’ MBTI results. In the subsequent study (Gjurković et al., 2020), the dataset was further enriched with demographic information about the users (age, gender, location, and language), and the labels for two other personality models. The distribution of MBTI types in this dataset also significantly deviated from that of the general US population (see Figure 3 in Section 6 for comparison of MBTI type distribution among different populations/datasets).
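
Labelling users from the type code in their flair could look roughly like the sketch below. It is an illustration under stated assumptions, not the procedure of Gjurković and Šnajder (2018): the function name `extract_type`, the regular expression, and the rule of rejecting flairs that mention more than one type are all hypothetical.

```python
import re
from itertools import product

# The 16 type codes: one letter per dimension (E/I, S/N, T/F, J/P).
MBTI_TYPES = {"".join(letters) for letters in product("EI", "SN", "TF", "JP")}

# Matches a four-letter type code as a whole word, in any letter case.
_TYPE_RE = re.compile(r"\b([EI][SN][TF][JP])\b", re.IGNORECASE)


def extract_type(flair):
    """Return the single MBTI type mentioned in `flair`, else None.

    A flair with no type code, or with several different codes, is
    treated as unlabelled (an assumption made for this sketch).
    """
    found = {match.upper() for match in _TYPE_RE.findall(flair)}
    return found.pop() if len(found) == 1 else None
```

With made-up flairs, `extract_type("INTP | 27 | Berlin")` returns `"INTP"`, while `extract_type("INTP or INTJ?")` returns `None` because the mention is ambiguous.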

Automatic assignment of MBTI type to each user in all above-mentioned studies is based on automatic extraction of contexts in which a certain MBTI type is mentioned. Without manual inspection of each such mention (which was only reported for the study by Verhoeven et al. (2016)), the assigned labels might not be reliable, as they may refer to someone else mentioned in the tweet and not the writer of the tweet, or they might be a part of a larger phrase, e.g. “I think/believe I am an INTP” or “I expect to get ESFJ as the result if I do personality assessment”.

To the best of our knowledge, the only study in which MBTI labels were obtained by explicitly asking participants to report their MBTI type, if they had done an MBTI personality test in the past, is our recent study (Štajner and Yenikent, 2021). The Amazon Mechanical Turk workers were also asked to describe their favourite type of vacations and preferred hobbies in a minimum of 300 characters each. We found that this type of text (responses to carefully selected open-end questions) contains more MBTI signals than tweets (even if concatenated together for each user). We further proposed detailed guidelines for MBTI personality annotation from textual data, and showed that expert human annotators have a high level of agreement among themselves on obtained textual answers when following provided guidelines. At the same time, we found that the annotators have a low level of agreement with the MBTI types reported by participants (based on their previous MBTI personality testing via popular questionnaires), which might be an indication that MBTI results obtained via questionnaires do not resonate well with the MBTI signals found in more natural textual forms.

The current study aims to overcome previously reported issues by proposing four questions with the answers on a 1–5 scale to obtain MBTI labels that better resonate with the expert human MBTI annotations on short texts.

3    Questionnaire

The whole questionnaire consisted of one optional question “You might have obtained your MBTI type in the past via questionnaires. If you know your MBTI type, please type it here”, four compulsory demographic questions, four compulsory questions with answers on a 1–5 scale that aimed to capture the participants’ MBTI type, and two compulsory open-end questions. Demographic questions encompassed gender, age, whether or not English is their native language, and the highest level of education obtained (Figure 1). The gender question had four possible answers: female, male, other, prefer not to specify. Five age groups were offered to choose from: 18–25, 26–35, 36–45, 46–55, and over 55.

Figure 1: Demographic questions.

After answering demographic questions, participants were provided with four questions that aimed to capture their MBTI type, and were asked to provide an answer on a 1–5 point scale. Those four questions are the central contribution of this study. By following the idea that aspects of leisure time represent the most natural version of personality, as it is directed by high degrees of intrinsic motivation (Štajner and Yenikent, 2021), the questions are focussed on typical leisure time activities: hobbies and vacations. This also gave us the opportunity to utilize the previously proposed open-end questions (Štajner and Yenikent, 2021) in the validation process (Section 5). In deciding the content of the questions for each individual dimension, we followed the main definitions provided by Briggs-Myers and Myers (1995). Although each MBTI dimension corresponds to multiple practical and behavioral characteristics, the core theoretical focus for every dimension is consistent.

The first question (for the E/I dimension) was designed with the idea of capturing whether the person prefers to be surrounded by people and social interactions, on one end of the scale (1 = extraverted), or to spend quiet and calm time by themselves, on the other end of the scale (5 = introverted). The second question (for the S/N dimension) aims to capture the characteristics of the tasks people would prefer to process, concrete or intuitive, by asking whether they prefer technical and hands-on hobbies (1 = sensing) or abstract and imaginative (5 = intuitive). The third MBTI dimension (T/F) is fundamentally about how people make their decisions, whether based on rational or emotional motives. As people do not engage with strict decision-making processes during their free time, which is ultimately based on their personal interests, the question measured the preference for rational (1 = thinking) or emotional (5 = feeling) reasoning for liking a certain hobby. The fourth question aimed to capture the preference for spontaneous and flexible (1 = perceiving), or a well-planned (5 = judging) schedule at vacations.

Figure 2: Questions for obtaining MBTI labels.

We initially prepared two questions for each MBTI dimension and performed a pilot study with 30 participants to choose those questions (Figure 2) that better correspond to the MBTI types provided by the participants, and the MBTI annotations by two annotators.

Finally, participants were asked to answer two open-end questions, which we previously proposed (Štajner and Yenikent, 2021) as the optimal questions for annotating MBTI types from texts:

    • Describe which kind of vacations you typically enjoy and why.
    • Describe what type of hobbies you enjoy and why.

The two questions were preceded by the following instructions: “The following questions aim to understand your life style preferences. While answering, please write down the first things that come to your mind without much contemplation.” To be accepted, each answer needed to contain a minimum of 300 characters.

4    Challenges in Data Collection

Data was collected via the Amazon Mechanical Turk (AMT) platform. We prepared the questionnaire as Google Forms and provided the link to it in the HIT of the AMT platform. We experimented with various setups in the platform: different values for monetary compensations, allowing only those participants with high scores on previous tasks, different times for validation of the answers and payment. The only variable that noticeably influenced the time needed for obtaining completed HITs was whether or not we restrict the participants according to their performance on the previous HITs. Without any restrictions, we were