2022 CIDA - Presentation

Authors

Naim Çınar, Züleyha Özbaş Anbarlı

Exploring Political Polarization in Femicide Discourse through Social Network Analysis: Relationships in Polarizations in the Pınar Gültekin Case

Web page where you can follow the presentation content:

qr-code-webpage

1 Introduction

After Pınar Gültekin went missing on July 16, 2020, her family, friends, and women’s movement organizations began efforts to find her, especially by using social media platforms. During this period, Twitter was used very actively.

Key dates in the process:

  • On July 21, 2020, it was revealed that she had been killed by a man named Cemal Metin Avcı.

  • In July 2020, debates around the Istanbul Convention intensified (there was a strong reaction on social media to Abdurrahman Dilipak’s July 27 column titled “AKP’nin papatyaları”).

  • The first hearing took place on November 9, 2020.

  • On March 20, 2021, Türkiye withdrew from the Istanbul Convention, following campaigns against the convention.

  • At the 13th hearing on June 20, 2022, a sentence of aggravated life imprisonment was first issued, then reduced to 23 years with “unjust provocation” mitigation.

1.1 Research Questions

  1. Between December 27, 2020 and January 3, 2021, what kind of network do users who tweet about the Pınar Gültekin murder form?
    • Who are the main actors in the network?

    • Who dominates the network? Which actors form clusters?

    • Who are the actors around the main actors?

    • Which political pole do actors belong to?

  2. Between December 27, 2020 and January 3, 2021, what is most frequently discussed in tweets about the Pınar Gültekin murder?

1.2 Data Collection

The data were collected in R with the academictwitteR package, using the Academic Research access level of Twitter API v2.

Academic Research level: monthly cap of 10 million tweets, with access to the full-archive search and full-archive tweet count endpoints. At the other levels (Essential, Elevated), only the last week of data can be accessed.

1.2.1 Distribution of tweet counts (time series plot)

[Time series plot: number of tweets per hour]

Distribution by tweet type:

Compare Tweet Types

1.2.2 Accessing all tweets

Code
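library(academictwitteR)

# Full-archive search for Turkish-language tweets matching any of the three terms.
# Raw responses are stored as JSON under data_path; n caps the number of tweets.
get_all_tweets(
  query = c("Pınar Gültekin", "#PınarGültekin", "#PınarGültekinİçinAdalet"),
  start_tweets = "2020-07-21T10:00:00Z",
  end_tweets = "2022-05-05T10:00:00Z",
  lang = "tr",
  file = "pinargultekin",
  data_path = "pg-tweet-data",
  n = 632200
)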

Total number of tweets in the dataset:

Code
nrow(joined)
[1] 478278

Dataset columns:

Code
colnames(joined.clean)
 [1] "id"                     "created_at"             "retweet_count"         
 [4] "like_count"             "quote_count"            "url"                   
 [7] "hashtag"                "mention_username"       "mention_id"            
[10] "sourcetweet_type"       "sourcetweet_id"         "sourcetweet_text"      
[13] "sourcetweet_author_id"  "text"                   "possibly_sensitive"    
[16] "author_id"              "user_username"          "user_name"             
[19] "user_description"       "user_profile_image_url" "user_url"              
[22] "user_verified"          "user_location"          "user_followers_count"  
[25] "user_following_count"   "user_tweet_count"       "user_list_count"       
[28] "source"                 "in_reply_to_user_id"   

Quoted, Retweet, Unique (including reply-to) tweet counts:

Code
joined %>%
  count(sourcetweet_type, name = "frequency")
  sourcetweet_type frequency
1           quoted      7620
2        retweeted    412642
3             <NA>     58016

Interactive data table created for the large dataset:

Twitter Data - Reactable Data Table

1.3 Social Network Analysis

1.3.1 Definition

  • An approach based on examining interactions among social actors.

  • It is grounded in graph theory, a branch of mathematics.

  • Graph theory studies graphs: mathematical representations of objects and the relationships among them. A graph consists of nodes (also called vertices or units) and edges (also called lines, ties, or links).

  • In its simplest form, a graph can be stored as an edge list: a two-column table in which each row names the two nodes joined by one edge.
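As a minimal sketch of the edge-list idea (the account names below are made up for illustration and do not come from the dataset), a two-column table can be turned into a directed graph with the igraph package:

```r
library(igraph)

# Hypothetical edge list: each row is one retweet, pointing from the
# retweeting account to the account that was retweeted.
edges <- data.frame(
  from = c("user_a", "user_b", "user_c", "user_c"),
  to   = c("news_x", "news_x", "news_x", "user_a")
)

# Build a directed graph; vertex names are taken from the two columns.
g <- graph_from_data_frame(edges, directed = TRUE)

vcount(g)  # number of nodes
ecount(g)  # number of edges
```

Any further columns in the table would be stored as edge attributes, which is one way to carry information such as device or relationship type along with each edge.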

1.3.2 Centrality measures

  • Actors (nodes) in a social network can take different structural positions and can affect the flow of information in different ways and at different levels.

  • Centrality measures make these positions visible.

Degree-based centrality measures    Shortest-path-based centrality measures
Degree centrality                   Betweenness centrality
Eigenvector centrality              Closeness centrality

  • Degree centrality depends on whether the network is directed: an undirected network has a single degree per node, while a directed network distinguishes in-degree (incoming edges) from out-degree (outgoing edges). Twitter can be analyzed as either directed or undirected; Facebook is typically undirected.

  • Degree centrality counts all neighbors equally; what matters is the number of neighbors.

  • In eigenvector centrality, a node becomes more important if it is connected to important (highly connected) nodes.

  • Betweenness centrality identifies which nodes are important in the flow of the network using shortest paths; it counts how many shortest paths pass through each node.

  • Closeness centrality also uses shortest paths and computes a node’s average distance to all other nodes; the smaller the distance, the more central the node.
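The measures described above can be tried out on a toy network. The sketch below uses igraph on a made-up directed graph (node names and the edge direction "retweeter -> retweeted" are assumptions for illustration); closeness and eigenvector centrality are computed here while ignoring edge direction, which is one possible choice and not necessarily the one used in the presentation.

```r
library(igraph)

# Hypothetical toy network; edge direction: retweeter -> retweeted account.
edges <- data.frame(
  from = c("a", "b", "c", "d", "hub"),
  to   = c("hub", "hub", "hub", "a", "e")
)
g <- graph_from_data_frame(edges, directed = TRUE)

in_deg  <- degree(g, mode = "in")            # number of incoming edges
out_deg <- degree(g, mode = "out")           # number of outgoing edges
btw     <- betweenness(g, directed = TRUE)   # shortest paths passing through a node
cls     <- closeness(g, mode = "all")        # inverse of summed distances, direction ignored
eig     <- eigen_centrality(g, directed = FALSE)$vector  # direction ignored

# Most central node according to each measure
sapply(list(in_degree = in_deg, betweenness = btw,
            closeness = cls, eigenvector = eig),
       function(x) names(which.max(x)))
```

In this toy graph the same node comes out on top for all four measures; in real networks the rankings usually differ, which is why several measures are reported side by side.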

1.3.3 Preparing the data for analysis

After a cleaning step, the total number of tweets in our dataset is: 443811

Graphs, in order: retweet, quoted, reply-to, mentions, and whole (all four relationship types combined):

This graph was created by an old(er) igraph version.
ℹ Call `igraph::upgrade_graph()` on it to use with the current igraph version.
For now we convert it on the fly...
IGRAPH da1878f DN-- 182668 412615 -- 
+ attr: name (v/c), device (e/c), relationship_type (e/c),
| created_at_round (e/n)
IGRAPH 762b833 DN-- 7301 7615 -- 
+ attr: name (v/c), device (e/c), relationship_type (e/c),
| created_at_round (e/n)
IGRAPH 476230f DN-- 6416 6962 -- 
+ attr: name (v/c), device (e/c), relationship_type (e/c),
| created_at_round (e/n)
IGRAPH bd55e4e DN-- 14226 16619 -- 
+ attr: name (v/c), device (e/c), relationship_type (e/c),
| created_at_round (e/n)
IGRAPH e607555 DN-- 194261 443811 -- 
+ attr: name (v/c), device (e/c), relationship_type (e/c),
| created_at_round (e/n)
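The summaries above list `relationship_type` as an edge attribute. A minimal sketch of how one graph per relationship type and a combined graph could be built from a single edge table is shown below; the column names `from` and `to`, the attribute values, and all account names are assumptions for illustration, not the presentation's actual code.

```r
library(igraph)

# Hypothetical edge table: one row per interaction.
edges <- data.frame(
  from = c("u1", "u2", "u3", "u1", "u4"),
  to   = c("n1", "n1", "u1", "n2", "n1"),
  relationship_type = c("retweet", "retweet", "quoted", "replyto", "mentions"),
  stringsAsFactors = FALSE
)

# Whole graph: every interaction, with relationship_type kept as an edge attribute.
whole_graph <- graph_from_data_frame(edges, directed = TRUE)

# One graph per relationship type, each built from the matching rows only.
graphs_by_type <- lapply(
  split(edges, edges$relationship_type),
  graph_from_data_frame,
  directed = TRUE
)

# The per-type edge counts add up to the edge count of the whole graph.
sapply(graphs_by_type, ecount)
ecount(whole_graph)
```

The same additivity holds for the edge counts reported above: 412615 + 7615 + 6962 + 16619 = 443811.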

1.3.4 Number of nodes and edges in the network created for the selected time window

Retweet graph:

IGRAPH 0b5c565 DN-- 15719 35851 -- 
+ attr: name (v/c), device (e/c), relationship_type (e/c),
| created_at_round (e/n)

Quoted graph:

IGRAPH eafd0da DN-- 363 296 -- 
+ attr: name (v/c), device (e/c), relationship_type (e/c),
| created_at_round (e/n)

Reply-to graph:

IGRAPH 8c39ba7 DN-- 705 758 -- 
+ attr: name (v/c), device (e/c), relationship_type (e/c),
| created_at_round (e/n)

Mentions graph:

IGRAPH 93b6552 DN-- 1044 1316 -- 
+ attr: name (v/c), device (e/c), relationship_type (e/c),
| created_at_round (e/n)

Whole graph:

IGRAPH 97730e8 DN-- 16601 38221 -- 
+ attr: name (v/c), device (e/c), relationship_type (e/c),
| created_at_round (e/n)

1.3.5 Centrality measures for the whole network (whole_graph)

Out-degree (whole):

  denizschmosby     aktepeyekta         gmrrty3   fahri10698828       umitcan25 
            107              95              85              75              62 
     battalgaz3        ykilicer arslansabanrt11  hudutsuzmenzil        kaangucl 
             59              59              57              57              51 

In-degree (whole):

      yenisafak    themarginale       debuffer2       zekibahce      medyaadami 
           3007            1612            1567            1127             936 
ajanshaberresmi    asliaydincer        hurriyet   anadoluajansi     malikejder_ 
            903             879             855             853             801 

Eigenvector centrality (whole):

account          in-degree  eigenvector           out-degree
yenisafak             3007  1.0                            3
debuffer2             1567  0.5294414939568258             4
themarginale          1612  0.484534292740911              0
malikejder_            801  0.454546744192743             23
ajanshaberresmi        903  0.4239984681274763            11
zekibahce             1127  0.34175188184066685            4
medyaadami             936  0.29958486738959045            5
asliaydincer           879  0.26428676830335074            0
anadoluajansi          853  0.2591656138502106             0
hurriyet               855  0.25872961070801237            0
enveryan               763  0.24934838027922762            2
fatmanuraltun          622  0.20846019997440765            0
neslihan3029           619  0.2027788949827502             6
slmhktn                600  0.2014278125610776             4
fatihtezcan            655  0.19705698932603072            2
yazparov               648  0.19666719949604478            2
avicenna_razi          603  0.18523846428714535            0
umutmurare             564  0.1759469537165706             6
eha_medya              383  0.16208797670774444            7
ferayicinadale1        429  0.14402642998342383            3
cnnturk                417  0.13016569021344046            0
akantalyali            396  0.12702158934500543            5
nuhalbayrak            421  0.126683476034561              2
cakiciefe1453          418  0.12614186875642203            1
blrcano0o_             391  0.12094855432093157            0
manidar_hayat          361  0.11274778908590867            0
yilmazgul35351         353  0.10700030294467901            1
tasdemir_cemile        351  0.10583156270356196            0
thelaikyobaz           348  0.10460126341451588            1
herkesicinchp          188  0.10329376786893778            0
drhuriyet              327  0.09980553210814692           22
mediamuhtari           322  0.0968945716676449             3
bayanteror             297  0.08921523645965673            1

Closeness (whole):

aahmetterdogann ibrahim61966688      dilek__rte       mondstern         ikiyaka 
    0.001512859     0.001517451     0.001517451     0.001519757     0.001522070 
      dilikedi1     diiek__rte_   enimenelegzet      ismhndncsy         dondu_e 
    0.001522070     0.001547988     0.001577287     0.001602564     0.001626016 

Betweenness (whole):

ferayicinadale1   denizschmosby     malikejder_       zekibahce   sanli_turk___ 
      24653.522       23747.595       12569.292        9507.165        6389.786 
      drhuriyet belkibirgun2335 hayriyeberberl1  kampuscadilari      umutmurare 
       5784.300        5271.323        4713.151        3443.000        3264.622 

1.3.5.1 Most informative centrality measure

To determine which centrality measure is the most informative about the network, we use Principal Component Analysis (PCA).

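PCA is a linear dimensionality reduction technique: it projects correlated variables (here, the centrality measures) onto a small number of uncorrelated components that capture most of the variance.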

The analysis was conducted using the CINNA package in R.
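The CINNA calls themselves are not shown in the presentation. As a rough illustration of the underlying idea only, the sketch below uses base R's `prcomp` instead of CINNA, on igraph's built-in Zachary karate club graph rather than the Twitter network. It reads "most informative" as "largest contribution to the first principal component", which is an assumption about how the result is interpreted.

```r
library(igraph)

# Stand-in network (not the Twitter data): Zachary's karate club, 34 nodes.
g <- make_graph("Zachary")

# One row per node, one column per centrality measure.
centralities <- data.frame(
  degree      = degree(g),
  eigenvector = eigen_centrality(g)$vector,
  betweenness = betweenness(g),
  closeness   = closeness(g)
)

# PCA on standardized measures, so that no measure dominates through its scale.
pca <- prcomp(centralities, center = TRUE, scale. = TRUE)

# Contribution (%) of each measure to the first principal component:
# the squared loadings of a component sum to 1.
contribution <- 100 * pca$rotation[, "PC1"]^2
sort(contribution, decreasing = TRUE)
```

Under this reading, the measure at the top of the sorted vector is the one that explains most of the shared variation among the centrality scores.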

1.3.6 Centrality measures in the retweet network

Out-degree (retweet):

    aktepeyekta         gmrrty3   fahri10698828       umitcan25      battalgaz3 
             95              85              75              62              59 
arslansabanrt11  hudutsuzmenzil        ykilicer        kaangucl     fedaimalkoc 
             57              57              55              51              49 

In-degree (retweet):

      yenisafak    themarginale       debuffer2       zekibahce      medyaadami 
           2988            1608            1549            1116             931 
ajanshaberresmi    asliaydincer   anadoluajansi        hurriyet     malikejder_ 
            890             873             829             816             786 

Closeness (retweet):

   borankaplan6     bahgunaydin       ahmt_gzel     nur26139681      tevfikaliz 
    0.006024096     0.006493506     0.006535948     0.006622517     0.006896552 
urfaliogluihsan         meb6307        ilkatman      yusra__571       sskartal3 
    0.006896552     0.006944444     0.006944444     0.006993007     0.006993007 

Betweenness (retweet):

1.3.7 Centrality measures in the quoted network

Out-degree (quoted):

politicalinnov2       aahmttprk      emtevbrane yildizdilek2009   unsalkartal58 
             21               5               5               4               4 
 muradcobanoglu     albay_birol       dcankocak    ahmettozlu29 yunus_arslan_ya 
              3               3               2               2               2 

In-degree (quoted):

  sgirgin48tbmm        hurriyet     devapartisi       yenisafak       debuffer2 
             31              19              15              14               9 
  anadoluajansi ajanshaberresmi      enginozkoc   unsalkartal58          takvim 
              9               8               5               4               4 

Closeness (quoted):

yildizdilek2009       aahmttprk      emtevbrane politicalinnov2    ahmettozlu29 
      0.1666667       0.2000000       0.2000000       0.2000000       0.2500000 
      insanvard   bozdagbulentt         29ercan        aygun_tk        yahreynn 
      0.2500000       0.3333333       0.3333333       0.3333333       0.3333333 

Betweenness (quoted):

      debuffer2 yankibuyuksezer      medyaadami        enveryan      ersinceliq 
              9               1               1               1               1 
 biriktisatci11    gurler_rustu       dcankocak yildizdilek2009    connectumkut 
              1               0               0               0               0 

1.3.8 Centrality measures in the reply-to network

Out-degree (reply-to):

  denizschmosby         orca34o    vatan66sever    okan54359803     malikejder_ 
             40              20              19              16              15 
       tumham11     murat202202 belkibirgun2335       sedatkck3     1071fatihan 
             14              13              12              12              11 

In-degree (reply-to):

 kilicdarogluk  sgirgin48tbmm  herkesicinchp   eczozgurozel     enginozkoc 
            53             39             29             28             18 
   malikejder_    gazetesozcu canan_kaftanci  cumhuriyetgzt       alimahir 
            15             13             11              8              8 

Closeness (reply-to):

belkibirgun2335   denizschmosby       yunlu1905         orca34o    vatan66sever 
     0.01694915      0.02777778      0.04000000      0.04166667      0.05555556 
    murat202202    okan54359803   ziyahafizoglu       sedatkck3    cerkes_giray 
     0.08333333      0.08333333      0.10000000      0.11111111      0.11111111 

Betweenness (reply-to):

  denizschmosby    okan54359803       zekibahce belkibirgun2335        yazparov 
           26.0            12.0             4.5             3.0             3.0 
    akantalyali      medyaadami     murat202202      miraataba1         slmhktn 
            2.0             1.5             0.0             0.0             0.0 

1.3.9 Centrality measures in the mentions network

Out-degree (mentions):

  denizschmosby    bizimtvcomtr    vatan66sever        kpopabla    okan54359803 
             67              32              26              23              22 
        orca34o     murat202202 belkibirgun2335     1071fatihan     executive61 
             20              18              16              15              15 

In-degree (mentions):

 herkesicinchp  kilicdarogluk  sgirgin48tbmm   eczozgurozel   bizimtvcomtr 
           157             97             84             34             32 
   gazetesozcu     enginozkoc  cumhuriyetgzt       hurriyet canan_kaftanci 
            18             18             17             15             15 

Closeness (mentions):

belkibirgun2335       yunlu1905   denizschmosby         orca34o    vatan66sever 
     0.01639344      0.03125000      0.03333333      0.04166667      0.05263158 
   okan54359803 smailen89581578     murat202202        kpopabla    cerkes_giray 
     0.06666667      0.07142857      0.08333333      0.08333333      0.10000000 

Betweenness (mentions):

 denizschmosby   okan54359803    haberturktv      zekibahce  aydemirbulent 
          26.0           15.0            9.0            4.5            4.0 
      yazparov        slmhktn    akantalyali gultekindavasi     medyaadami 
           4.0            3.0            2.0            2.0            1.5 

1.4 Social network visualizations

Visualization link

1.4.1 Interpreting “unnatural-looking” relationships

Difference between troll and bot accounts:

Bot: an automated social media account programmed to imitate human behavior.

  • Detection tool: Botometer, developed at Indiana University by a team including Onur Varol (Sabancı University).

Troll: an account controlled by a human, acting individually or as part of a coordinated group. Possible indicators:

  • Time-based correlation (retweets happening at similar times right after the original tweet)

  • Account activity (e.g., user_tweet_count)

  • Content similarity (e.g., accounts that continuously retweet)
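The time-based indicator can be sketched in base R as follows. The column `source_created_at`, the 10-second threshold, and all values are hypothetical and chosen only for illustration; a high share of near-instant retweets is a cue for manual inspection, not proof of coordination.

```r
# Hypothetical retweet records: who retweeted which tweet, and when (UTC).
retweets <- data.frame(
  user_username  = c("acc1", "acc2", "acc3", "acc4"),
  sourcetweet_id = c("t1", "t1", "t1", "t1"),
  created_at     = as.POSIXct(c("2020-12-27 10:00:05", "2020-12-27 10:00:07",
                                "2020-12-27 10:00:09", "2020-12-27 13:30:00"),
                              tz = "UTC"),
  # Assumed column: posting time of the original tweet.
  source_created_at = as.POSIXct("2020-12-27 10:00:00", tz = "UTC")
)

# Delay between the original tweet and each retweet, in seconds.
retweets$delay_sec <- as.numeric(
  difftime(retweets$created_at, retweets$source_created_at, units = "secs")
)

# Assumed threshold: retweets within 10 seconds count as "near-instant".
threshold_sec <- 10
retweets$near_instant <- retweets$delay_sec <= threshold_sec

# Share of near-instant retweets per original tweet.
share <- aggregate(near_instant ~ sourcetweet_id, data = retweets, FUN = mean)
share
```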

Similarly, a recent Harvard University study on identifying troll accounts (Detecting Troll, Saving Democracy) uses:

  • Content (detecting frequently repeated word groups using techniques similar to our word analysis)

  • Followers

  • Following

  • Retweet count

1.4.2 Example: identifying suspicious interactions

Information on which devices tweets were sent from (device type as an edge attribute):

device-type

Accounts that retweeted TheMarginale’s tweet (device type not Android, iPhone, or iPad):

themarginale-tweet

Filtered data:

Code
library(dplyr)

# Keep tweets posted within the selected time window.
st1.joined.clean <- joined.clean %>%
  filter(created_at > "2020-12-27T00:00:18", created_at < "2021-01-02T00:23:18")

# Keep only retweets of the relevant tweet that were sent from the web client.
st1.joined.clean.filtered <- st1.joined.clean %>%
  filter(source == "Twitter Web App",
         sourcetweet_id == "1343167361466191874")

Time series analysis for the filtered data:

Screenshots of accounts that retweeted the relevant tweet:

themarginale-retweet-accounts

themarginale-retweet-accounts-tweet-counts

1.5 Automated text analysis

1.5.1 Text cleaning steps

In order:

  1. Converting some special characters and Turkish uppercase letters to Latin letters (e.g., ‘α’ = ‘a’, ‘á’ = ‘a’, ‘é’ = ‘e’, ‘Ü’ = ‘u’)
  2. Converting uppercase to lowercase
  3. Removing Turkish stopwords (e.g., “ve”, “şuna”, “tamam”, “yine”… 473 words)
  4. Removing some special characters (e.g., removepunctuation, removenumbers, removehashtags, removeurl…)
  5. Converting Turkish characters to Latin equivalents
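The cleaning steps above can be sketched in base R as follows. This is an illustration under assumptions, not the presentation's code: the function name `clean_tweet`, the three-word stopword sample, and the character mappings are hypothetical. Non-ASCII characters are written as `\u` escapes so that the script does not depend on the file encoding.

```r
# Hypothetical stopword sample; the presentation used a list of 473 words.
stopwords_tr <- c("ve", "yine", "tamam")

clean_tweet <- function(text, stopwords = stopwords_tr) {
  # 1. Map selected special and Turkish uppercase characters to Latin letters:
  #    alpha, a-acute, e-acute, U-umlaut, dotted capital I.
  text <- chartr("\u03b1\u00e1\u00e9\u00dc\u0130", "aaeui", text)
  # 2. Convert to lowercase.
  text <- tolower(text)
  # 3. Remove stopwords (whole whitespace-separated tokens only).
  tokens <- strsplit(text, "[[:space:]]+")
  text <- vapply(tokens,
                 function(w) paste(w[!(w %in% stopwords)], collapse = " "),
                 character(1))
  # 4. Remove URLs, hashtags, punctuation, and numbers.
  text <- gsub("https?://[^[:space:]]+", " ", text)
  text <- gsub("#[^[:space:]]+", " ", text)
  text <- gsub("[[:punct:]]+", " ", text)
  text <- gsub("[[:digit:]]+", " ", text)
  # 5. Map lowercase Turkish characters to Latin equivalents:
  #    c-cedilla, g-breve, dotless i, o-umlaut, s-cedilla, u-umlaut.
  text <- chartr("\u00e7\u011f\u0131\u00f6\u015f\u00fc", "cgiosu", text)
  # Collapse the whitespace left behind by the removals.
  trimws(gsub("[[:space:]]+", " ", text))
}

clean_tweet("\u0130stanbul ve Mu\u011fla 2020 #tag https://t.co/x adalet!")
```

Because stopwords are removed before punctuation in this order, a stopword directly followed by a comma would survive; swapping steps 3 and 4 would avoid that.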

1.5.2 Most frequent words

              word    n
1            pınar 4760
2       gültekinin 2827
3         gültekin 2514
4          davadan 2247
5              chp 2128
6            chpli 1896
7            muğla 1457
8           babası 1121
9     milletvekili 1097
10        süleyman  938
11          vazgeç  870
12        babasını  756
13          girgin  722
14        vazgeçin  611
15        ailesine  562
16           vekil  518
17           diyen  495
18 milletvekilinin  489
19        arayarak  487
20      katledilen  446

1.5.3 Word cloud

1.5.4 Most frequent emojis

# A tibble: 10 × 2
   emoji     n
   <chr> <int>
 1 😡       44
 2 🔴       34
 3 ❗       29
 4 🔹       25
 5 🔥       23
 6 👇       22
 7 ▪️        18
 8 📌       17
 9 💣       16
10 🤬       16

1.6 Skip-gram model

Splitting text into smaller parts with n-grams and skip-grams enables examining correlations and the context around words.

  • An n-gram is a sequence of n adjacent items (here, items are words) in a given text.

  • The value n is the number of items in each sequence. If n = 1 it is a “unigram”; if n = 2, a “bigram” (two consecutive words); if n = 3, a “trigram”.

  • n-gram models are frequently used in natural language processing (NLP) to predict the next word/text.

  • For k-skip-n-grams, n is the number of items (words) in each sequence and k is the maximum number of items that may be skipped between them.

  • Therefore, an n-gram (with no skips) is the same as a 0-skip-n-gram.

  • The skip-gram model is an unsupervised learning technique used to identify contextually related surrounding words for a given word in a text.
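To make the k-skip-n-gram definition concrete for n = 2, here is a minimal base R sketch. The helper name `skip_bigrams` is hypothetical and the token vector is an illustrative example, not taken from the dataset.

```r
# All k-skip-bigrams of a token vector: ordered pairs of words with at most
# k other words between them (k = 0 gives ordinary bigrams).
skip_bigrams <- function(tokens, k = 0) {
  n <- length(tokens)
  pairs <- character(0)
  if (n < 2) return(pairs)
  for (i in seq_len(n - 1)) {
    last <- min(i + k + 1, n)       # furthest partner allowed for word i
    for (j in (i + 1):last) {
      pairs <- c(pairs, paste(tokens[i], tokens[j]))
    }
  }
  pairs
}

tokens <- c("pinar", "gultekin", "icin", "adalet")
skip_bigrams(tokens, k = 0)  # 3 ordinary bigrams
skip_bigrams(tokens, k = 1)  # 5 pairs, including "pinar icin"
```

Counting how often each pair occurs across all tweets yields a weighted edge list of word pairs, which is the kind of input a word network visualization needs.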

1.6.1 Network visualization

sp1-skipgram-network

1.6.2 Clusters from the “all-time” text analysis

 [1] "gültekin, pınar, katledilen, öğrencisi, yeni, cinayetinde, vahşice, üniversite, öldürülen, gültekini, cinayeti, davasında, son, katleden, flaş"                                  
 [2] "cemal, metin, katil, mertcan, zanlısı, kardeşi, avcı, tahliye, avcının, sanık, cma, cüce, muğladaki, isimli, mekanın"                                                            
 [3] "kadınların, pınargültekin, önceki, kişiyi, etiketler, sesiyim, yazıp, etiketleyebildiğiniz, çiçek, yeter, istanbulsözleşmesiyaşatır, çek, üzerinden, kadınasiddetedurde, misiniz"
 [4] "gültekinin, davadan, katili, babasını, arayarak, vazgeç, sıddık, babası, ailesinin, ailesine, arayıp, rezan, cansız, diyen, avukatı"                                             
 [5] "emine, ozgecan, sule, münevver, aleyna, ceren, helin, cet, güleda, bulut, aslan, cakır, karabulut, ozdemir, oldürüldü"                                                           
 [6] "süleyman, chp, muğla, chpli, milletvekili, ağır, suç, yönetim, katilin, ilçe, iddianame, hakkında, ceza, gündür, ailesi"                                                         
 [7] "ortaya, ifadesi, görüntüleri, çıktı"                                                                                                                                             
 [8] "pinargultekin, adalet, pınargültekiniçinadalet, gerçek, erkek, istiyoruz, eski, tweet, imza, kampanyaya, arkadaşı, pınargueltekinicinadalet, yerini, sevgilisi, atın"            
 [9] "kadın, reddi, hakim, öldü, ülkede, cinayetleri, kız, yakılarak, talebi, insanlık, koklamaya, diyor, cesedi, külünü, öpüp"                                                        
[10] "adli, otopsi, tıp, raporu"                                                                                                                                                       
[11] "diri, yakılmış, yakıldığı"                                                                                                                                                       
[12] "üzerine, üstüne, beton, dökülen, varile, koyup, döken, dökülmüş, konup, dökülerek"                                                                                               
[13] "kan, donduran"                                                                                                                                                                   
[14] "allahtan, allah, rahmet, belanızı, versin, eylesin"                                                                                                                              
[15] "bağ, keşif, evinde, yapılacak, evine"                                                                                                                                            
[16] "cinayete, kurban, giden"                                                                                                                                                         
[17] "ört, bas, etmeye, etmek, isteyen, pis, çalıştı, ellerini"                                                                                                                        
[18] ""                                                                                                                                                                                
[19] ""                                                                                                                                                                                

1.6.3 Text analysis – network visualization

667,875 bigram pairs (created via skip-gram analysis)

Code
nrow(skip.gram.count)
[1] 667875

Network visualization link