RA Task

Author

Ming Cui

Data Task

Background

The dataset comes from a survey where 8688 participants from 38 different countries completed two key questionnaires. You can see participants’ nationalities using the “country” variable at the end of the dataset.

In the first questionnaire, participants rated their level of concern from 1 (“Not at all concerned”) to 5 (“Very concerned”) about 10 different threats facing their society. These threats (and their variable names) were:

  • Natural disaster (SocietalThreat_1)
  • Famine or drought (SocietalThreat_2)
  • Pollution (SocietalThreat_3)
  • Terrorism (SocietalThreat_4)
  • Crime surge (SocietalThreat_5)
  • Discrimination (SocietalThreat_6)
  • Increase in national debt (SocietalThreat_7)
  • A surge of COVID-19 cases (SocietalThreat_8)
  • Influx of refugees (SocietalThreat_9)
  • Illegal immigration (SocietalThreat_10).

In the second questionnaire, participants rated their desire for more “cultural tightness”–more restrictive cultural norms in their society–using the 9-item scale developed by Jackson and colleagues (2019, Plos One). Higher scores on the 1-7 scale represented a greater desire for cultural tightness (the variables corresponding to this scale are “CT_1” - “CT_9”). None of the items are reverse-coded.

Task

Using this dataset, we would like you to prepare and upload a data summary as a pdf or word document. You do NOT need to submit code, and you can use any software that you like to analyze the data. Your data summary should have five sections:

  1. A figure which visualizes the average level of concern for each threat, assembled in a way that the reader can clearly see which threats are more concerning and which threats are least concerning.

  2. A brief written summary of whether or not the desire for cultural tightness scale is reliable.

  3. A table which communicates the correlation between desire for cultural tightness and level of concern for each threat (there should be 10 correlations in this table, one for each threat variable).

  4. A written summary of a regression model which summarizes the association between concern about a rise of COVID-19 cases and desire of cultural tightness. In this summary, state the association’s effect size and statistical significance.

  5. A figure which shows the association between COVID-19 concern and desire for cultural tightness (e.g., a scatterplot) for three different countries.

Please end the document with a concluding paragraph which summarizes your inferences about the relationship between perceived societal threat(s) and desire for cultural tightness. Longer submissions are not necessarily better. The submission does not need to be longer than 3 pages.

My Code

Average level of concern for each threat

library(pacman)
p_load(viridis, # platte
       tidyverse,
       gt,
       ggExtra, # for hist plot
       rstatix, # cor_mat
       lavaan, # cfa
       gghalves, # for half plots
       semTools # reliability
       )
RA_task_df <- read_csv("Dataset.csv")
Rows: 8688 Columns: 25
-- Column specification --------------------------------------------------------
Delimiter: ","
chr  (1): country
dbl (24): Age, Native, Education, Income, Religion, CT_1, CT_2, CT_3, CT_4, ...

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
temp1 <- RA_task_df |> 
  select(SocietalThreat_1:SocietalThreat_10) |> 
  pivot_longer(
    cols = starts_with("S"),
    names_to = "social_threat",
    values_to = "value"
  )

means <- aggregate(value ~  social_threat, temp1, mean)

means <- means |> 
  arrange(desc(value)) |> 
  mutate_if(is.numeric, ~round(., 2))

means
       social_threat value
1   SocietalThreat_8  4.05
2   SocietalThreat_3  4.04
3   SocietalThreat_5  4.04
4   SocietalThreat_7  3.88
5   SocietalThreat_6  3.78
6   SocietalThreat_1  3.76
7   SocietalThreat_4  3.76
8   SocietalThreat_2  3.61
9  SocietalThreat_10  3.54
10  SocietalThreat_9  3.39

1. A figure which visualizes the average level of concern for each threat, assembled in a way that the reader can clearly see which threats are more concerning and which threats are least concerning.

According to the graph below, “A surge of COVID-19 cases” is the most concerning, while “Influx of refugees” is the least.

RA_task_df |> 
  select(SocietalThreat_1:SocietalThreat_10) |> 
  pivot_longer(
    cols = starts_with("S"),
    names_to = "social_threat",
    values_to = "value"
  ) |> 
  ggplot(aes(x = reorder(social_threat, value),       # define x var
             y = value)) +
  geom_point(
    aes(color = social_threat), # we want different colors for each level of x
    position = position_jitter(width=.1), # add jitter to the observations
    size=.05, alpha=.1) +
  stat_summary(fun=mean,       # this indicates we want the mean statistic
               geom="point",   # we want the mean to be represented by a geom
               shape=21,       # use shape 21 (a circle with fill) for the mean
               size=1.5, col="black", fill="red") +
  geom_half_violin(aes(fill=social_threat),   # different colors for each level of x
                   bw=.45, side="r",                  # styling for the violin plot
                   position = position_nudge(x=.3)) +
  coord_flip() +
  theme_minimal() +
  geom_text(data = means, aes(label = value, y = value + 0.3)) +
  geom_half_boxplot(aes(fill=social_threat),  # different colors for each level of x
                    side="r", outlier.shape=NA, center=TRUE, # styling for boxplots
                    position = position_nudge(x=.15),        # position of boxplots
                    errorbar.draw=FALSE, width=.2) +
  theme(legend.position = "none") +
  scale_x_discrete(labels=rev(c('A surge of COVID-19 cases', 'Pollution', 'Crime surge', 'Increase in national debt', 'Discrimination', 'Natural disaster', 'Terrorism', 'Famine or drought', 'Illegal immigration', 'Influx of refugees'))) +
  xlab("Societal  Threats") + # for the x axis label
  ylab("Ratings") +
  labs(caption = "Raincloud plot combining a probability density function, jittered data points, 
       a mean represented by the red dot, and a box plot.") +
  theme(
  plot.caption = element_text(size = 10, hjust = 0.5)
)

2. A brief written summary of whether or not the desire for cultural tightness scale is reliable.

The results of confirmatory factor analysis below show that CFI and TLI > 0.9, RMSEA < 0.08, standardized factor loadings > 0.5, and alpha and omega > 0.7, which indicate that the one-factor model of tightness scale is acceptable.

# Specify model
f <- "
Tightness =~ CT_1 + CT_2 + CT_3 + CT_4 + CT_5 + CT_6 + CT_7 + CT_8 + CT_9
"

# Estimate model
cfa_tightness <- cfa(f, data=RA_task_df, std.lv=T, missing='direct', 
             estimator='MLR')

# The results can be viewed using the summary

summary(cfa_tightness, fit.measures=T, standardized=T)
lavaan 0.6.15 ended normally after 18 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        27

  Number of observations                          8688
  Number of missing patterns                         1

Model Test User Model:
                                              Standard      Scaled
  Test Statistic                              1069.844     580.066
  Degrees of freedom                                27          27
  P-value (Chi-square)                           0.000       0.000
  Scaling correction factor                                  1.844
    Yuan-Bentler correction (Mplus variant)                       

Model Test Baseline Model:

  Test statistic                             24371.466   12716.850
  Degrees of freedom                                36          36
  P-value                                        0.000       0.000
  Scaling correction factor                                  1.916

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.957       0.956
  Tucker-Lewis Index (TLI)                       0.943       0.942
                                                                  
  Robust Comparative Fit Index (CFI)                         0.958
  Robust Tucker-Lewis Index (TLI)                            0.944

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)            -138140.012 -138140.012
  Scaling correction factor                                  1.318
      for the MLR correction                                      
  Loglikelihood unrestricted model (H1)    -137605.090 -137605.090
  Scaling correction factor                                  1.581
      for the MLR correction                                      
                                                                  
  Akaike (AIC)                              276334.023  276334.023
  Bayesian (BIC)                            276524.905  276524.905
  Sample-size adjusted Bayesian (SABIC)     276439.104  276439.104

Root Mean Square Error of Approximation:

  RMSEA                                          0.067       0.049
  90 Percent confidence interval - lower         0.063       0.046
  90 Percent confidence interval - upper         0.070       0.051
  P-value H_0: RMSEA <= 0.050                    0.000       0.822
  P-value H_0: RMSEA >= 0.080                    0.000       0.000
                                                                  
  Robust RMSEA                                               0.066
  90 Percent confidence interval - lower                     0.061
  90 Percent confidence interval - upper                     0.071
  P-value H_0: Robust RMSEA <= 0.050                         0.000
  P-value H_0: Robust RMSEA >= 0.080                         0.000

Standardized Root Mean Square Residual:

  SRMR                                           0.029       0.029

Parameter Estimates:

  Standard errors                             Sandwich
  Information bread                           Observed
  Observed information based on                Hessian

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  Tightness =~                                                          
    CT_1              0.826    0.022   37.963    0.000    0.826    0.517
    CT_2              0.807    0.020   39.627    0.000    0.807    0.508
    CT_3              0.907    0.021   43.349    0.000    0.907    0.532
    CT_4              0.961    0.020   49.111    0.000    0.961    0.610
    CT_5              1.279    0.018   69.375    0.000    1.279    0.705
    CT_6              1.189    0.017   68.531    0.000    1.189    0.705
    CT_7              1.161    0.017   69.970    0.000    1.161    0.720
    CT_8              1.001    0.019   53.601    0.000    1.001    0.631
    CT_9              1.136    0.018   63.768    0.000    1.136    0.692

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .CT_1              4.536    0.017  264.518    0.000    4.536    2.838
   .CT_2              4.143    0.017  243.420    0.000    4.143    2.612
   .CT_3              3.859    0.018  211.030    0.000    3.859    2.264
   .CT_4              4.473    0.017  264.742    0.000    4.473    2.840
   .CT_5              4.679    0.019  240.492    0.000    4.679    2.580
   .CT_6              4.839    0.018  267.306    0.000    4.839    2.868
   .CT_7              4.643    0.017  268.551    0.000    4.643    2.881
   .CT_8              4.541    0.017  266.620    0.000    4.541    2.860
   .CT_9              4.457    0.018  253.177    0.000    4.457    2.716
    Tightness         0.000                               0.000    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .CT_1              1.872    0.043   43.636    0.000    1.872    0.733
   .CT_2              1.866    0.037   50.047    0.000    1.866    0.741
   .CT_3              2.083    0.041   51.095    0.000    2.083    0.717
   .CT_4              1.556    0.037   41.628    0.000    1.556    0.627
   .CT_5              1.653    0.042   39.528    0.000    1.653    0.502
   .CT_6              1.433    0.036   39.787    0.000    1.433    0.503
   .CT_7              1.249    0.032   38.490    0.000    1.249    0.481
   .CT_8              1.518    0.036   41.610    0.000    1.518    0.602
   .CT_9              1.402    0.037   38.384    0.000    1.402    0.521
    Tightness         1.000                               1.000    1.000
reliability(cfa_tightness)
       Tightness
alpha  0.8520590
omega  0.8544411
omega2 0.8544411
omega3 0.8539080
avevar 0.4004140

3. A table which communicates the correlation between desire for cultural tightness and level of concern for each threat (there should be 10 correlations in this table, one for each threat variable).

RA_task_df <- RA_task_df |> 
  mutate(tightness = rowMeans(dplyr::select(RA_task_df, starts_with("CT"))))
library(apaTables)

RA_task_df |> 
  dplyr::select(tightness, starts_with("CT")) |> 
  apa.cor.table(attitude, filename="Table1_APA.doc", table.number=1)
Warning in if (show.conf.interval == FALSE) {: the condition has length > 1 and
only the first element will be used


Table 1 

Means, standard deviations, and correlations with confidence intervals
 

  Variable     M    SD   1          2          3          4          5         
  1. tightness 4.46 1.11                                                       
                                                                               
  2. CT_1      4.54 1.60 .60**                                                 
                         [.58, .61]                                            
                                                                               
  3. CT_2      4.14 1.59 .59**      .24**                                      
                         [.57, .60] [.22, .26]                                 
                                                                               
  4. CT_3      3.86 1.70 .61**      .27**      .29**                           
                         [.60, .63] [.25, .29] [.27, .31]                      
                                                                               
  5. CT_4      4.47 1.57 .67**      .41**      .32**      .33**                
                         [.66, .68] [.39, .43] [.30, .34] [.31, .35]           
                                                                               
  6. CT_5      4.68 1.81 .74**      .31**      .38**      .37**      .35**     
                         [.73, .75] [.29, .33] [.36, .39] [.35, .39] [.33, .37]
                                                                               
  7. CT_6      4.84 1.69 .73**      .33**      .34**      .35**      .39**     
                         [.72, .74] [.31, .35] [.32, .36] [.33, .36] [.37, .41]
                                                                               
  8. CT_7      4.64 1.61 .74**      .36**      .36**      .39**      .47**     
                         [.73, .75] [.34, .37] [.34, .37] [.37, .40] [.45, .48]
                                                                               
  9. CT_8      4.54 1.59 .68**      .37**      .29**      .32**      .42**     
                         [.67, .69] [.35, .39] [.27, .31] [.31, .34] [.40, .44]
                                                                               
  10. CT_9     4.46 1.64 .73**      .38**      .39**      .40**      .43**     
                         [.72, .74] [.36, .40] [.37, .40] [.38, .41] [.42, .45]
                                                                               
  6          7          8          9         
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
  .60**                                      
  [.58, .61]                                 
                                             
  .50**      .50**                           
  [.48, .51] [.49, .52]                      
                                             
  .44**      .45**      .47**                
  [.42, .45] [.43, .46] [.45, .49]           
                                             
  .49**      .46**      .50**      .40**     
  [.48, .51] [.45, .48] [.48, .51] [.38, .42]
                                             

Note. M and SD are used to represent mean and standard deviation, respectively.
Values in square brackets indicate the 95% confidence interval.
The confidence interval is a plausible range of population correlations 
that could have caused the sample correlation (Cumming, 2014).
 * indicates p < .05. ** indicates p < .01.
 
names_threat <- RA_task_df |> 
  dplyr::select(starts_with("Soci")) |> 
  colnames()

corr_mat <- RA_task_df |> 
  cor_test(vars = "tightness", vars2 = names_threat)
Warning: Using an external vector in selections was deprecated in tidyselect 1.1.0.
i Please use `all_of()` or `any_of()` instead.
  # Was:
  data %>% select(names_threat)

  # Now:
  data %>% select(all_of(names_threat))

See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
corr_mat |> 
  arrange(desc(cor)) |> 
  mutate_if(is.numeric, ~round(., 4)) |> 
  write.csv("cor_table_jackson.csv", row.names = FALSE)

read_csv("cor_table_jackson1.csv")|> 
  mutate_if(is.numeric, ~round(., 3)) |> 
  gt() |> 
  tab_header(
    title = "Correlations between Cultural Tightness and Societal Threats",
  ) |> 
  cols_align(
    align = "center",
  ) |> 
  cols_align(
    align = "left",
    columns = "Societal Threat"
  )
Rows: 10 Columns: 5
-- Column specification --------------------------------------------------------
Delimiter: ","
chr (2): Societal Threat, p
dbl (3): Pearson Correlation, 95%CI Lower, 95%CI Higher

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
Correlations between Cultural Tightness and Societal Threats
Societal Threat Pearson Correlation p 95%CI Lower 95%CI Higher
Crime surge 0.200 < 0.001 0.176 0.217
Influx of refugees 0.180 < 0.001 0.159 0.200
Illegal immigration 0.170 < 0.001 0.151 0.192
Increase in national debt 0.160 < 0.001 0.142 0.183
Pollution 0.110 < 0.001 0.090 0.132
A surge of COVID-19 cases 0.110 < 0.001 0.086 0.127
Famine or drought 0.099 < 0.001 0.079 0.120
Natural disaster 0.094 < 0.001 0.073 0.115
Discrimination 0.078 < 0.001 0.057 0.098
Terrorism 0.051 < 0.001 0.030 0.072

4. A written summary of a regression model which summarizes the association between concern about a rise of COVID-19 cases and desire of cultural tightness. In this summary, state the association’s effect size and statistical significance.

simple_regression <- lm(SocietalThreat_8 ~ tightness, data = RA_task_df)

summary(simple_regression)

Call:
lm(formula = SocietalThreat_8 ~ tightness, data = RA_task_df)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.3407 -0.9458  0.6593  0.9523  1.3473 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  3.53802    0.05284  66.961   <2e-16 ***
tightness    0.11467    0.01149   9.984   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.193 on 8686 degrees of freedom
Multiple R-squared:  0.01135,   Adjusted R-squared:  0.01123 
F-statistic: 99.68 on 1 and 8686 DF,  p-value: < 2.2e-16
library(lme4)
Loading required package: Matrix

Attaching package: 'Matrix'
The following objects are masked from 'package:tidyr':

    expand, pack, unpack
model_0 <- lmer(SocietalThreat_8 ~ 1 + (1 | country), REML = T, data = RA_task_df)

summary(model_0)
Linear mixed model fit by REML ['lmerMod']
Formula: SocietalThreat_8 ~ 1 + (1 | country)
   Data: RA_task_df

REML criterion at convergence: 27262.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.1325 -0.5462  0.3317  0.7771  1.3575 

Random effects:
 Groups   Name        Variance Std.Dev.
 country  (Intercept) 0.1138   0.3374  
 Residual             1.3333   1.1547  
Number of obs: 8686, groups:  country, 38

Fixed effects:
            Estimate Std. Error t value
(Intercept)  4.04792    0.05658   71.55
0.1138/(0.1138+1.3333)
[1] 0.07864004
# 7.8% of the total variability in performance anxiety scores are attributable to differences among subjects.
model_1 <- lmer(SocietalThreat_8 ~ 1 + tightness + (1 + tightness | country), REML = T, data = RA_task_df)
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00367363 (tol = 0.002, component 1)
summary(model_1)
Linear mixed model fit by REML ['lmerMod']
Formula: SocietalThreat_8 ~ 1 + tightness + (1 + tightness | country)
   Data: RA_task_df

REML criterion at convergence: 27216.2

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.2668 -0.5319  0.3419  0.7528  1.7957 

Random effects:
 Groups   Name        Variance Std.Dev. Corr 
 country  (Intercept) 0.201093 0.44843       
          tightness   0.005558 0.07455  -0.69
 Residual             1.322041 1.14980       
Number of obs: 8686, groups:  country, 38

Fixed effects:
            Estimate Std. Error t value
(Intercept)  3.68482    0.09412  39.151
tightness    0.07910    0.01773   4.462

Correlation of Fixed Effects:
          (Intr)
tightness -0.814
optimizer (nloptwrap) convergence code: 0 (OK)
Model failed to converge with max|grad| = 0.00367363 (tol = 0.002, component 1)
(1.3333-1.322041)/1.3333
[1] 0.008444461
anova(model_1, model_0, test = "Chisq")
refitting model(s) with ML (instead of REML)
Data: RA_task_df
Models:
model_0: SocietalThreat_8 ~ 1 + (1 | country)
model_1: SocietalThreat_8 ~ 1 + tightness + (1 + tightness | country)
        npar   AIC   BIC logLik deviance  Chisq Df Pr(>Chisq)    
model_0    3 27265 27286 -13629    27259                         
model_1    6 27218 27260 -13603    27206 52.498  3  2.346e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library(mixedup)

extract_random_effects(model_1) |> 
  dplyr::filter(effect=="tightness") |> 
  ggplot(aes(reorder(group, -value), value, color = group)) +        # ggplot2 plot with confidence intervals
  geom_point() +
  geom_errorbar(aes(ymin = lower_2.5, ymax = upper_97.5)) +
  scale_color_viridis(option="viridis",discrete=T) +
  geom_hline(yintercept=0, linetype="dashed", 
                color = "red", size=0.5) +
  theme(legend.position = "none") +
  xlab("Country") + # for the x axis label
  ylab("Estimated Slopes") +
  labs(caption = "Estimated slopes by the random effects model with cultural tightness as level-1 predictor.") +
  theme(
  plot.caption = element_text(size = 10, hjust = 0.5)
) +
  coord_flip()
Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
i Please use `linewidth` instead.

5. A figure which shows the association between COVID-19 concern and desire for cultural tightness (e.g., a scatterplot) for three different countries.

China <- RA_task_df |> 
  dplyr::filter(country == "China") |> 
  ggplot(aes(x=tightness, y=SocietalThreat_8)) +     # defines x and y axis variables
  
  # add observations to scatterplot
  geom_point(size=1.5, alpha=.5, colour="darkgrey", position = position_jitter(width=.2)) + # define size and colour
                                                    # alpha adds transparency
  # add fitted slope and 95% CIs
  geom_smooth(size=1,method=lm,colour="slateblue")+ # define size and colour
                                                    # method=lm indicates linear slope
  
  xlab("Culture Tightness") +                       # x-axis label
  ylab("COVID-19 Threat") +                       # y-axis
  theme_minimal() + # apply minimal theme to hide many unnecessary features of plot
  theme(            # make modifications to the theme
    panel.grid.major.y=element_line(),   # show major grid for y axis
    panel.grid.minor.y=element_line(),   # show minor grid for y axis
    panel.grid.major.x=element_line(),   # show major grid for x axis
    panel.grid.minor.x=element_line(),   # show minor grid for x axis
    text=element_text(size=14),          # font aesthetics 
    axis.text=element_text(size=12),
    axis.title=element_text(size=14,face="bold")) +
  scale_x_continuous(breaks=seq(0,7,1)) +  # x-axis tick marks
  scale_y_continuous(breaks=seq(0,5,1)) + 
  labs(caption = "The relationship between cultural tightness and perceived COVID-19 threat in China, 
  including a 95% confidence band of the slope, 
       with histograms of the variable on each axis in the opposite margins.") +
  theme(
  plot.caption = element_text(size = 10, hjust = 0.5)
)

ggMarginal(China, type="histogram",     # add histograms to marginal plot
                   fill = "lightgrey",     # color of histograms
                   xparams = list(bins=8),# n of bins for x variable
                   yparams = list(bins=6)) 
`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'

Canada <- RA_task_df |> 
  dplyr::filter(country == "Canada") |> 
  ggplot(aes(x=tightness, y=SocietalThreat_8)) +     # defines x and y axis variables
  
  # add observations to scatterplot
  geom_point(size=1.5, alpha=.5, colour="darkgrey", position = position_jitter(width=.2)) + # define size and colour
                                                    # alpha adds transparency
  # add fitted slope and 95% CIs
  geom_smooth(size=1,method=lm,colour="slateblue")+ # define size and colour
                                                    # method=lm indicates linear slope
  
  xlab("Culture Tightness") +                       # x-axis label
  ylab("COVID-19 Threat") +                       # y-axis
  theme_minimal() + # apply minimal theme to hide many unnecessary features of plot
  theme(            # make modifications to the theme
    panel.grid.major.y=element_line(),   # show major grid for y axis
    panel.grid.minor.y=element_line(),   # show minor grid for y axis
    panel.grid.major.x=element_line(),   # show major grid for x axis
    panel.grid.minor.x=element_line(),   # show minor grid for x axis
    text=element_text(size=14),          # font aesthetics 
    axis.text=element_text(size=12),
    axis.title=element_text(size=14,face="bold")) +
  scale_x_continuous(breaks=seq(0,7,1)) +  # x-axis tick marks
  scale_y_continuous(breaks=seq(0,5,1)) + 
  labs(caption = "The relationship between cultural tightness and perceived COVID-19 threat in Canada, 
  including a 95% confidence band of the slope, 
       with histograms of the variable on each axis in the opposite margins.") +
  theme(
  plot.caption = element_text(size = 10, hjust = 0.5)
)

ggMarginal(Canada, type="histogram",     # add histograms to marginal plot
                   fill = "lightgrey",     # color of histograms
                   xparams = list(bins=8),# n of bins for x variable
                   yparams = list(bins=6)) 
`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'

USA <- RA_task_df |> 
  dplyr::filter(country == "USA") |> 
  ggplot(aes(x=tightness, y=SocietalThreat_8)) +     # defines x and y axis variables
  
  # add observations to scatterplot
  geom_point(size=1.5, alpha=.5, colour="darkgrey", position = position_jitter(width=.2)) + # define size and colour
                                                    # alpha adds transparency
  # add fitted slope and 95% CIs
  geom_smooth(size=1,method=lm,colour="slateblue")+ # define size and colour
                                                    # method=lm indicates linear slope
  
  xlab("Culture Tightness") +                       # x-axis label
  ylab("COVID-19 Threat") +                       # y-axis
  theme_minimal() + # apply minimal theme to hide many unnecessary features of plot
  theme(            # make modifications to the theme
    panel.grid.major.y=element_line(),   # show major grid for y axis
    panel.grid.minor.y=element_line(),   # show minor grid for y axis
    panel.grid.major.x=element_line(),   # show major grid for x axis
    panel.grid.minor.x=element_line(),   # show minor grid for x axis
    text=element_text(size=14),          # font aesthetics 
    axis.text=element_text(size=12),
    axis.title=element_text(size=14,face="bold")) +
  scale_x_continuous(breaks=seq(0,7,1)) +  # x-axis tick marks
  scale_y_continuous(breaks=seq(0,5,1)) + 
  labs(caption = "The relationship between cultural tightness and perceived COVID-19 threat in the USA, 
  including a 95% confidence band of the slope, 
       with histograms of the variable on each axis in the opposite margins.") +
  theme(
  plot.caption = element_text(size = 10, hjust = 0.5)
)

ggMarginal(USA, type="histogram",     # add histograms to marginal plot
                   fill = "lightgrey",     # color of histograms
                   xparams = list(bins=8),# n of bins for x variable
                   yparams = list(bins=6)) 
`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'