Sources of data
SOCC
Kolhatkar, V., H. Wu, L. Cavasso, E. Francis, K. Shukla and M. Taboada (2018) The SFU Opinion and Comments Corpus: A corpus for the analysis of online news comments. Simon Fraser University.
Paper describing the corpus: Kolhatkar, V., H. Wu, L. Cavasso, E. Francis, K. Shukla and M. Taboada (2020) The SFU Opinion and Comments Corpus: A corpus for the analysis of online news comments. Corpus Pragmatics 4: 155–190.
GitHub page, with links to download the corpus: https://github.com/sfu-discourse-lab/SOCC
SFU Review Corpus
Project Gutenberg
Any book from Project Gutenberg, which distributes books that are out of copyright.
Democracy Checkup
For survey data, we use the Democracy Checkup distributed by Odesi, a Canadian consortium that holds social science data. This is a survey of Canadian attitudes about democratic values, public policies, and current issues:
Harell, Allison; Stephenson, B. Laura; Rubenson, Daniel; Loewen, Peter John, 2023, “Democracy Checkup, 2022. Canada”, Harell et al. (2023), Borealis, V1, UNF:6:ufqbMikbXcaHqVhbaEXR3w== (fileUNF)
- Harell, A., Stephenson, B. L., Rubenson, D., & Loewen, P. J. (2023). Democracy Checkup, 2022 [Canada]. In Democracy Checkup. Borealis. https://doi.org/10.5683/SP3/TEKM3T