Book Review

Grimmer, J., Roberts, M.E., & Stewart, B.M. (2022). Text as Data: A New Framework for Machine Learning and the Social Sciences. Princeton University Press.

Authors

  • Tamás Varga, Eötvös Loránd University

DOI:

https://doi.org/10.17356/ieejsp.v10i4.1410

Keywords:

computational methods, machine learning, text analysis, social sciences, textual data

Abstract

Social scientists, digital humanities scholars, and industry professionals now regularly work with large-scale document corpora. A large dataset of texts, while rich in information, is insufficient on its own to generate meaningful insights. Researchers must approach the dataset with well-defined research questions that guide the analysis and ensure the relevance of the findings, and they must apply methods that are aligned with their research objectives. Beyond methodological rigor, scholars must also critically assess the limits of the dataset's validity: evaluating the accuracy, reliability, and completeness of the data, and recognizing any inherent biases.

The book aims to illustrate how to treat “text as data” for social science tasks and problems. It adopts a six-part structure, with the parts divided into chapters and subchapters. Each part is structured around five fundamental concepts: representation, discovery, measurement, prediction, and causal inference. In doing so, it serves as a comprehensive guide for researchers, delineating the capabilities and limitations inherent in text data methodologies.

 

Published

2025-02-17

How to Cite

[1]
Varga, T. 2025. Book Review: Grimmer, J., Roberts, M.E., & Stewart, B.M. (2022). Text as Data: A New Framework for Machine Learning and the Social Sciences. Princeton University Press. Intersections. East European Journal of Society and Politics. 10, 4 (Feb. 2025), 160–165. DOI:https://doi.org/10.17356/ieejsp.v10i4.1410.