Python for text analysis

This book is an introduction to Python through practical applications in text analysis, especially for the humanities and social sciences. It is part of a course at Simon Fraser University.

The book can be read on its own, but it makes full sense only alongside the course materials in SFU’s learning management system, which are available to students enrolled in the course.

Course objectives

The course introduces basic concepts and tools for text analysis using the Python programming language. It addresses data capture and manipulation, data cleaning and preprocessing, and text analysis for linguistics and other social sciences.

At the end of the course, students will have learnt the basics of Python programming and will understand how to process language data for various analyses.

More specifically, students will:

  • Learn core concepts of programming (variables, functions, objects)

  • Learn to install and use basic packages for text analysis (NLTK, spaCy)

  • Be able to collect and store a dataset using existing Python packages

  • Clean and normalize language data

  • Perform natural language processing analysis on language data

This book is made available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) license.

  • BY: Credit must be given to the creator.

  • NC: Only noncommercial uses of the work are permitted.

  • SA: Adaptations must be shared under the same terms.

Acknowledgements

  • Some of the units contain code and ideas from other sources, which are referenced in those units

  • Logo by Greg Holoboff, CEE at SFU

  • Built with Jupyter Book

Suggested citation

Taboada, Maite (2025) Python for text analysis. Version 1. https://maitetaboada.github.io/python_text_analysis