Part I. Introduction to Data Science and Interactive Visualization Tools for the Analysis of Qualitative Evidence
1. Truly Equal-Status Mixed Methods Design (TESM2D)
1.Qual Versus Quant is no longer?
2.Truly Equal-Status Mixed Methods Design (TESM2D)
3.Relevance of data science and interactive visualizations in the birth of TESM2D
1.How can data science and interactive visualizations be used to synthesize qualitative evidence?
2.Relevance of No-code or low-code analytic approaches for TESM2D
4.TESM2D and its Connection with Data Science Democratization
2. Textual and Relational data (TRD)
1.Textual and Relational data (TRD) Almost Overwhelming Availability
2.Importance of data science techniques to make sense of TRD
3.Integrating Spatial Contextualization in the Study of TRD
3. Digital Ethnography, Data Science, and Ethical Considerations
1.Digital Ethnography and Data Science for Qualitative Evidence Analyses
2.What ethical considerations arise when applying data science tools to the analysis of TRD?
- Guidelines for informed consent
- Repercussions for Institutional Review Boards
- Protecting Privacy and Confidentiality
4. Book Plan and Organization
1.Approaches to analyzing TRD
2.Software Introduction for Tools to analyze TRD
3.Description of Data Sources for Replication and Reproducibility Purposes
4.How to Read this Book and its Standalone Chapters?
5.Link Between Network Modeling and Relational Data
6.Link Between Text Classification and Textual Data
7.Integration of Spatial Context To Data Storytelling
Part II. Network modeling frameworks
This suite of frameworks relies on network analysis methods to retrieve the mathematical structure embedded in qualitative and relational data. The three frameworks provide researchers with dynamic and/or interactive visualizations highlighting central topics and actors, as well as software tools designed to highlight the processes and contexts wherein qualitative evidence emerged, including hypothesis tests via Monte Carlo simulations.
5. Network Analysis of Qualitative Data (NAQD)
NAQD analyzes the mathematical structure contained in participants’ coded responses (labels assigned to their transcribed responses), identifies influential actors and coded responses via centrality measures, and identifies similar concerns based on group ascription or participants’ roles (via quadratic assignment correlations). Additionally, this procedure identifies participants’ network communities given their shared responses. Outputs include fully interactive HTML visualizations, downloadable databases with community ascription, and distributions of Monte Carlo simulations of likelihood quantiles.
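The idea of treating coded responses as a network can be illustrated with a minimal sketch. This is not the NAQD software itself: the participants, codes, and function names below are hypothetical, and the centrality shown is a simple degree share (the fraction of participants who mention each code), assumed here only for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical coded responses: participant -> set of codes assigned to
# that participant's transcribed answers (illustrative data only).
coded = {
    "P1": {"cost", "family"},
    "P2": {"cost", "mentoring"},
    "P3": {"cost", "family", "mentoring"},
    "P4": {"belonging"},
}

def code_degree_centrality(coded):
    """Share of participants who mention each code (two-mode degree)."""
    counts = defaultdict(int)
    for codes in coded.values():
        for code in codes:
            counts[code] += 1
    n_participants = len(coded)
    return {code: k / n_participants for code, k in counts.items()}

def shared_code_edges(coded):
    """Participant-to-participant ties weighted by number of shared codes."""
    edges = {}
    for a, b in combinations(sorted(coded), 2):
        shared = coded[a] & coded[b]
        if shared:
            edges[(a, b)] = len(shared)
    return edges

centrality = code_degree_centrality(coded)
edges = shared_code_edges(coded)
```

In this toy example "cost" is the most central code, and P4 shares no code with anyone, so P4 has no ties in the participant network. Community detection and quadratic assignment correlations would build on an edge list of this kind.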
6. Graphical Retrieval and Analysis of Temporal Information Systems (GRATIS)
GRATIS analyzes the chronological/temporal evolution of information provided by research participants or retrieved from document analyses/essays in qualitative studies. GRATIS offers the possibility of observing the simultaneous evolution of information as retrieved across all research participants, even when data collection did not happen at the same time or in the same space. This analysis is achieved via global time stamps. This framework also identifies the relevance of actors and coded responses and renders dynamic, fully interactive HTML visualizations.
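One way to picture the role of global time stamps is to place every coded statement on a single shared timeline and take cumulative snapshots of the network. The sketch below is an assumption about how such an alignment could work, not the GRATIS implementation; the events, the ISO 8601 stamp format, and the function name `snapshot` are hypothetical.

```python
from datetime import datetime

# Hypothetical events: (global time stamp, participant, coded response).
# Interviews took place on different dates, but all stamps share one timeline.
events = [
    ("2023-03-01T10:00", "P1", "cost"),
    ("2023-01-15T09:30", "P2", "family"),
    ("2023-02-10T14:00", "P1", "family"),
    ("2023-03-05T11:00", "P3", "cost"),
]

def snapshot(events, until):
    """Return the actor-to-code ties observed up to and including `until`."""
    cutoff = datetime.fromisoformat(until)
    return {
        (actor, code)
        for stamp, actor, code in events
        if datetime.fromisoformat(stamp) <= cutoff
    }

end_of_february = snapshot(events, "2023-02-28T23:59")
full_network = snapshot(events, "2023-12-31T23:59")
```

Rendering one snapshot per cutoff, in chronological order, yields the frames of a dynamic visualization.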
7. Visual Evolution, Replay, and Integration of Temporal Analytic Systems (VERITAS)
VERITAS analyzes the evolution of information in focus group interactions occurring in the same space (including virtual spaces such as video calls) and in real time. Analyses are separated into (a) evolution of message or information exchanges among participants (actor to actor) and (b) evolution of coded responses over time (actor to coded responses). This framework renders dynamic and interactive HTML visualizations.
8. Relational Frameworks for Data Mining and Data Retrieval via Co-authorship Networks (CN)
This chapter offers a framework to model relationships among units, which may be used as a map to detect influential and peripheral players. Specifically, we offer an example that may take data from SCOPUS to detect the most influential co-authors in the topic “ChatGPT.” This network may offer a systematic and innovative approach to conducting literature reviews efficiently, by identifying central authors who may be collaborating across different communities of thought, as reflected in their publication records. Co-authorship networks (CN) may also be integrated with the results of Machine Driven Literature Classification (MDLC).
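A co-authorship network can be sketched from nothing more than the author list of each publication. The example below assumes the records have already been exported and parsed into lists of author names; the names, the data, and the helper functions are hypothetical, and no SCOPUS API call is shown.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author lists, one per publication (illustrative data only).
papers = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author C", "Author D"],
    ["Author E"],  # single-authored paper: contributes no tie
]

def coauthor_edges(papers):
    """Count how many papers each pair of authors has written together."""
    edges = Counter()
    for authors in papers:
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges

def degree(edges):
    """Number of distinct co-authors per author."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return {author: len(others) for author, others in neighbors.items()}

edges = coauthor_edges(papers)
degrees = degree(edges)
```

Here Author C has the most distinct collaborators and links Author D to the rest of the network, which is the kind of bridging position the chapter describes.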
Part III. Machine Driven Text Classification and Statistical Modeling frameworks
This second suite of frameworks and software tools offers qualitative and mixed methods researchers state-of-the-art methods to synthesize qualitative evidence in three main areas: code or label identification of written or transcribed data typically employed in qualitative studies; classification of document analyses, with main applications to (but not limited to) systematic literature reviews; and the closing of open-ended questions in survey research. Open-ended questions allow survey respondents to express their views in their own words and provide information about processes or reasons typically absent in quantitative studies, but they may become difficult or impossible to analyze manually given time and/or financial constraints.
9. Latent Code Identification (LACOID)
LACOID identifies latent codes (topics) in qualitatively gathered data such as interviews, focus groups, essays, media posts, and ethnographic observations. In addition to all identified latent codes along with the original texts, LACOID outputs include dynamic HTML visualizations of each latent code and its constituent word frequencies.
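The word-frequency summaries mentioned above can be illustrated once each text has been assigned to a latent code. The sketch below assumes that assignment already exists (the topic model itself is not shown); the texts, code numbers, and function name are hypothetical, and the tokenizer is a deliberately simple regular expression.

```python
import re
from collections import Counter, defaultdict

# Hypothetical output of a prior classification step:
# (original text, latent code assigned to it).
classified = [
    ("Tuition is too expensive", 0),
    ("Expensive books and tuition", 0),
    ("My mentor helped me", 1),
]

def word_frequencies_by_code(classified):
    """Tally lowercase word counts separately for each latent code."""
    freqs = defaultdict(Counter)
    for text, code in classified:
        freqs[code].update(re.findall(r"[a-z']+", text.lower()))
    return freqs

freqs = word_frequencies_by_code(classified)
top_words_code_0 = freqs[0].most_common(2)
```

A real analysis would typically remove stop words such as "is" and "and" before counting, so that the summaries emphasize substantive terms.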
10. Machine Driven Classification of Open-ended Responses (MDCOR)
MDCOR closes open-ended questions in surveys by providing procedures for detecting themes and assessing the optimal number of latent topics in thousands of survey responses. Outputs include access to the original text of open-ended classified responses and HTML summaries of each topic.
11. Machine Driven Literature Classification (MDLC)
MDLC identifies latent topics or themes in document analyses, including (but not limited to) systematic literature reviews. This framework allows for the assessment of the optimal number of topics in a set of documents and/or research articles. Outputs will allow access to the classified list of documents and latent codes along with dynamic HTML summaries.
Part IV. Integration of Network and Text Classification Analyses
12. In what instances should or could we integrate the analyses and frameworks described in Parts II and III?
1.Pros and Cons
2.Best practices
13. Incorporating Spatial Context for Data StoryTelling: GeoStoryTelling
1.Data storytelling and the Academic Research Process
2.Digital Ethnography and Geographical Information Systems (GIS)
3.Multimedia tools to share stories
4.GIS and Data StoryTelling: GeoStoryTelling
14. Sentiment Network Modeling
1.Descriptive modeling tools for Text Analysis
2.Sentiment Analysis
3.Integrating Sentiment Analysis with Relational Thinking
4.Integrating Sentiment Analysis with Classified Topics used in MDCOR, LACOID, and MDLC
15. Closing thoughts and future work