Book Review
Grimmer, J., Roberts, M.E., & Stewart, B.M. (2022). Text as Data: A New Framework for Machine Learning and the Social Sciences. Princeton University Press.
DOI:
https://doi.org/10.17356/ieejsp.v10i4.1410
Keywords:
computational methods, machine learning, text analysis, social sciences, textual data
Abstract
Social scientists, digital humanities scholars and industry professionals now regularly leverage large-scale document corpora. A large dataset of texts, while providing a wealth of information, is insufficient on its own to generate meaningful insights. It is essential to approach the dataset with well-defined research questions that guide the analytical process and ensure the relevance of the findings. Moreover, deriving meaningful answers requires the application of appropriate methodologies that are aligned with the research objectives. In addition to methodological rigor, scholars must critically assess the limitations of the dataset's validity. This involves evaluating the accuracy, reliability, and completeness of the data, as well as recognizing any inherent biases.
The book aims to illustrate how to treat “text as data” for social science tasks and problems. It adopts a six-part structure, divided into chapters and subchapters. Each part is structured around five fundamental concepts: representation, discovery, measurement, prediction, and causal inference. By doing this, it serves as a comprehensive guide for researchers, delineating the capabilities and limitations inherent in text data methodologies.
Copyright Notice
Authors who publish with this journal agree to the following terms:
Authors retain copyright and grant the journal right of first publication, with the work three months after publication simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal. This acknowledgement is not automatic; it should be requested from the editors and can usually be obtained one year after the work's first publication in the journal.