Harvesting Twitter Data for Studying Motor Behavior in Disabled Populations: An Introduction and Tutorial in Python

Title:
Harvesting Twitter Data for Studying Motor Behavior in Disabled Populations: An Introduction and Tutorial in Python
Language:
English
Authors:
Nicholas E. Fears (ORCID 0000-0001-7081-0015), Riya Chatterjee, Priscila M. Tamplain (ORCID 0000-0003-2713-5733), Haylie L. Miller (ORCID 0000-0003-4372-1206)
Source:
Journal of Motor Learning and Development. 2023 11(3):555-570.
Availability:
Human Kinetics, Inc. 1607 North Market Street, Champaign, IL 61820. Tel: 800-474-4457; Fax: 217-351-1549; e-mail: info@hkusa.com; Web site: https://journals.humankinetics.com/view/journals/jmld/jmld-overview.xml
Peer Reviewed:
Y
Page Count:
16
Publication Date:
2023
Sponsoring Agency:
National Institute of Mental Health (NIMH) (DHHS/NIH)
Contract Number:
K01MH107774
Document Type:
Journal Articles; Reports - Research
DOI:
10.1123/jmld.2023-0006
ISSN:
2325-3193
2325-3215
Entry Date:
2024
Accession Number:
EJ1411546
Database:
ERIC

Further Information

Social media platforms are rich and dynamic spaces where individuals communicate on a person-to-person level and to broader audiences. These platforms provide a wealth of publicly available data that can shed light on the lived experiences of people from numerous clinical populations. Twitter can be used to examine individual expressions and community discussions about specific characteristics (e.g., motor skills, burnout) associated with a diagnostic group. These data are useful for understanding the perspectives of a diverse, international group of self-advocates representing a wide range of clinical populations. Here, we provide a framework for how to harvest data from Twitter through their free, academic researcher application programming interface access using Python, a free, open-source programming language. We also provide a sample data set harvested using this framework and a set of analyses on these data specifically related to motor differences in neurodevelopmental conditions. This framework offers a cost-effective and flexible means of harvesting and analyzing Twitter data. Researchers should utilize these resources to advance our understanding of the lived experiences of clinical populations through social media platforms and to determine the critical questions that are of most importance to improving quality of life.

As Provided

Keywords: social media; autism; developmental coordination disorder; dyspraxia

Technology and, more specifically, social media use are ubiquitous across daily life in the 21st century and have changed how many people communicate. Social media platforms provide near-instant and direct interaction between individuals as well as between an individual and an organization ([23]). In particular, Twitter provides opportunities for public interaction between individuals and has seen rapid uptake and growth in some English-speaking countries, such as the United States and Australia ([7]). Although many academics use Twitter for networking, publication announcements, and scientific critiques ([31]), Twitter also provides a new and unique opportunity for health researchers interested in a wide range of topics such as hearing loss ([11]), health care workers' compassion ([9]), obesity ([17]), and COVID-19 ([14]). Health researchers have used Twitter to monitor public sentiment and discourse around a topic (e.g., body image), to surveil prevalence of discussions about specific topics (e.g., influenza), to analyze public response to organizational engagement (e.g., the Centers for Disease Control and Prevention's public announcements), and to conduct network analyses among members of a clinical population (e.g., relationships between cancer patients; [28]). One potential barrier to the expanded use of Twitter for health research specific to motor behavior research is the lack of an established best practice for the process of harvesting and analyzing Twitter data. Here, we provide an overview of our recommended approach as a starting point for development of best-practice guidelines and a uniform pipeline that can be adapted to address a variety of research questions related to motor behavior.

To illustrate how Twitter can be used to study motor behavior, we will focus on two primary communities: autistic[1] individuals and those diagnosed with developmental coordination disorder (DCD)/dyspraxia. The methods described here are applied to these two populations in independent studies reported in this special section (Chatterjee et al., under review in this Journal of Motor Learning and Development [JMLD] special section; Tamplain et al., under review in this JMLD special section) as examples. However, there are many other populations whose Twitter data could provide valuable context for the role of motor differences in identity and daily living, including cerebral palsy, attention-related conditions, dyslexia, dysgraphia, Ehlers-Danlos syndrome, intellectual disability, Down syndrome, Fragile X syndrome, and others. While there is a significant amount of work on social media discourse and identity among autistic adults, there are no studies of this nature specific to DCD/dyspraxia, nor any specifically on how autistic or DCD/dyspraxic individuals' motor behavior is represented on social media. As such, the theoretical framework for our approach centers around prior studies of the autistic and physically disabled Twitter communities, which we predict will parallel the DCD/dyspraxic community and other neurodivergent and disabled communities in many ways.

Overall, self-reported social media use among autistic adults is high, with nearly 80% of autistic adults using social media for an average of 3 hr per day for 5 days per week ([19]). Autistic adults use social media for a wide range of reasons, including social connection, entertainment and information, business purposes, and maintaining contact with family ([19]). Specifically, Twitter is one of the most popular social media platforms used by autistic adults ([30]). Previous autism research using Twitter data examined topics such as discourse (e.g., appropriateness of person-first vs. identity-first language; [2]; [3]; [12]; [27]), usage ([30]), burnout ([18]), and social support ([25]) across multiple groups (e.g., any user, autistic individuals, significant others of autistic individuals, advocacy organizations). Examining content from all users on Twitter, Beykikhoshk et al. ([3]) found that discourse tended to focus on topics relating to children and boys, rather than adults or girls. Further content analysis of tweets about autism showed that discussion primarily centered around autism awareness and information about autism ([12]) or empowerment and support of autistic individuals ([2]), but tweets about autistic individuals' behaviors and their health are also highly prevalent ([12]).

Using Twitter to examine autistic individuals' own accounts of their experiences is an excellent way to identify common themes related to particular topics or behaviors. For example, Mantzalas et al. ([18]) examined autistic individuals' experiences of autistic burnout using data from Twitter and other social media sites. Examining 17,923 tweets and 172 forum posts from a popular autistic-run web community, Wrong Planet, they found that autistic burnout was a chronic condition with a direct impact on the health and well-being of autistic individuals. Autistic individuals discussed the lack of knowledge among clinicians regarding autistic burnout, and also agreed that masking—a technique autistic individuals use to pass as neurotypical—was a leading risk factor for inducing burnout. This study showed that Twitter can provide useful data regarding the specific health-related conditions and lived experiences of autistic individuals.

Although autism has been associated with social communication differences and restricted, repetitive patterns of behaviors, interests, or activities (e.g., echolalia, difficulties with transitions; [1]), more than 90% of autistic individuals also experience significant motor differences ([21]). These motor differences are strongly associated with the development of functional behavior in autistic individuals (e.g., play, self-care, handwriting, gesturing; [4]; [13]). Furthermore, researchers have begun to examine similarities and differences in motor differences between autism and DCD/dyspraxia ([8]; [16]; [20], [21]). To extend this research and better understand the role of motor differences in autistic individuals' identities, the workflow provided here and the two related studies in this special section (Chatterjee et al., under review in this JMLD special section; Tamplain et al., under review in this JMLD special section) targeted terms related to autism (e.g., Autism Spectrum Disorder, autistic) and DCD/dyspraxia (e.g., DCD, dyspraxic) found in users' bios and tweets, yielding a data set that could be analyzed for motor-related content.

Self-advocacy among the broader disabled community is also evident on social media, where identity-related discourse can provide an avenue for knowledge acquisition and sharing of lived experiences and resources ([29]). Disability activism on Twitter has also challenged ableism through movements such as #EverydayAbleism, #CripTheVote, #DisabledAndCute, #Disabodiposi, and #MeCripple. Tweets using the #MeCripple hashtag described disabled users' experiences with ableism and discrimination in daily living, raising awareness in the nondisabled community and providing a platform for social support ([22]). Body positivity discourse is also present within the disabled community on social media platforms, including Twitter, where the #DisabledAndCute movement encouraged users to share selfies and lived experiences of the intersection between their physical and psychosocial identities. Intellectually disabled adolescents and adults also use social media for similar reasons and report positive experiences obtaining social support for decision making and confiding predominantly via Facebook, Twitter, and Skype ([32]).

We speculate that DCD/dyspraxic users access Twitter for reasons similar to those affirmed by the autistic and disabled communities: social connection, identity exploration and affirmation, activism, resource sharing, and knowledge acquisition. Low awareness of DCD among key stakeholders ([15]) and low diagnosed prevalence rates may lead to a dearth of opportunities for support or difficulty finding other DCD/dyspraxic peers in one's local community. By contrast, social media offers a broader community, and Twitter hashtags provide a means of easily identifying oneself, other self-advocates, and topics of particular interest.

Disabled people's online identities tend to reliably reflect their actual identity characteristics ([24]), and the community of neurodivergent and disabled self-advocates on Twitter is diverse, active, and continually growing. Twitter discourse has been demonstrated to represent global perspectives on disability culture ([26]). As such, Twitter data is an attractive choice for researchers seeking firsthand accounts of lived experiences of motor differences from self-advocates across a wide range of populations and backgrounds. No previous studies have examined autistic or DCD/dyspraxic people's social media representations of their identity and lived experiences specifically in the context of motor behavior. To our knowledge, this special section also contains the first reports of data obtained from the naturalistic discourse of DCD/dyspraxic self-advocates on Twitter or any similar social media platform. In this paper, we use research questions related to the manifestation of motor differences in autism and DCD/dyspraxia as an example to demonstrate our recommended approach to harvesting and analyzing Twitter data to understand the lived experiences of individuals with developmental disabilities. Expanded description of our application of these methods to harvesting and analysis of data from members of the #ActuallyAutistic and #DCD/dyspraxia communities can also be found in this special section (Chatterjee et al., under review in this JMLD special section; Tamplain et al., under review in this JMLD special section).

Current Study

The goal of the current paper is to provide a foundation for development of best-practice guidelines and a uniform pipeline for harvesting and analyzing Twitter data that can be adapted to address a variety of research questions related to motor behavior. Here, we focus on tweets from autistic individuals and individuals with DCD/dyspraxia. First, we describe the process of accessing and harvesting data from Twitter using the Twitter application programming interface (API; version 2.0; https://developer.twitter.com/en/products/twitter-api) and Python (version 3.8.10; https://www.python.org/downloads/release/python-3810/; Figure 1). Second, we present the results of an example search to demonstrate the application of our pipeline, including number of tweets harvested, number of individual users, and a keyword analysis. Finally, we discuss the potential utility of our method for obtaining data directly from a diverse, international sample of self-advocates.

Graph: Figure 1 —Twitter application programming interface and Python programming workflow. API = application programming interface.

Method

Setting Up a Twitter Researcher Account

Twitter provides real-time and archival access to Twitter data for business, nonprofit, educational, and research purposes through their Twitter Developer Platform. To use the Twitter Developer Platform (https://developer.twitter.com/en), users need to create an account. This can either be an existing Twitter account (i.e., a personal account) or a new account specifically created for the purpose of accessing the Developer Platform. There are several different levels of access to back-end Twitter data via the Developer Platform, one of which is Academic Research access. The Academic Research access level is designed to support the academic research community by providing free access to real-time and full-archival search capabilities.

To receive Academic Research access, users must complete an application process in which Twitter reviews the applicant's academic credentials, research project goals, research methods, and dissemination plan. Graduate students, doctoral candidates, postdoctoral fellows, faculty, or research-focused employees at academic institutions or universities are eligible for Academic Research access. For their academic profile within the application (separate from their Twitter user profile), applicants must list their name as it appears in the institution's documentation; a webpage to establish their identity; the applicant's university name and location; the applicant's department, school, or lab name; the applicant's field of study; and their current role. For their research project details, the applicant must provide information about the proposed research project including research questions, funding, types of analyses, and plans for dissemination. In our team's experience, approval of an Academic Research access request took fewer than five business days. For more information, see https://developer.twitter.com/en/products/twitter-api/academic-research/application-info. Other access types with varied associated costs are also available for those wishing to access the API for nonacademic research purposes, but these are outside the scope of this paper.

Twitter API (Version 2.0)

For accessing the data, Twitter provides an API currently in its second version. The Twitter API in combination with the Academic Research access provides searchable access to all public tweets within the last 7 days (real time) as well as Twitter's complete public archival database dating back to March 2006. There are numerous ways to access the Twitter API, which can be found on the Twitter Developer Platform website. Here, we will focus on accessing the API using Python, a high-level, general purpose programming language. While other options are available for accessing the API, Python is a relatively straightforward language with an easy learning curve for users with limited prior coding experience. For this reason, we recommend Python as the primary language for accessing the API in our workflow, unless your group has a specific need for a different approach or regularly uses a different language (e.g., R, Ruby, JavaScript, C#). We will demonstrate how to engage the Twitter API by connecting to the API endpoint (i.e., web address), verifying authorization, setting your search parameters, harvesting the data from the Twitter archive, and storing it in a comma-separated values (CSV) file for further analysis.

Utilizing the API With Python

Setting Up Your Python Environment

The first step in developing your Python code for accessing the Twitter API is to set up your Python environment. This may be completed via an integrated development environment (IDE; e.g., IDLE, Spyder, PyCharm, Jupyter) or in the command line. We recommend that novice users use an IDE, which is a tool that assists with writing, debugging, and running your code by highlighting errors and suggesting or autocompleting functions. For this project, the standalone Spyder IDE 5.3.3 (https://www.spyder-ide.org/) with Python 3.8.10 64 bit was used on a Windows 11 operating system. We chose the Spyder IDE because it is a free, open-source, cross-platform (i.e., can be run on Windows, Mac, Linux) environment with integrated libraries commonly used for data science (e.g., NumPy, pandas). Spyder was specifically designed by and for scientists, and it has substantial documentation and a robust online community available for support. For this reason, it is a good choice for users with limited prior experience working with Python in an IDE. To set up your environment, first install and import the required packages. The following packages are required for this project: requests, os, csv, dateutil.parser, and time (Figure 2).

Graph: Figure 2 —List of previously installed Python packages to import. Comments indicate what each package will be used for in this project.
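Because the figure itself is not reproduced here, the following is a minimal sketch of how the package list above might be checked before running the workflow. The helper name `missing_packages` and the one-line descriptions of each package's role are assumptions made for illustration. `os`, `csv`, and `time` ship with Python, whereas `requests` and `dateutil` (distributed as `python-dateutil`) typically need to be installed separately.

```python
import importlib.util

# Packages named in the text, with an assumed one-line role for each.
REQUIRED_PACKAGES = {
    "requests": "sending HTTP requests to the Twitter API endpoint",
    "os": "reading the Bearer Token from the environment",
    "csv": "writing harvested tweets to a CSV file",
    "dateutil.parser": "parsing tweet timestamps",
    "time": "pausing between requests to respect rate limits",
}


def missing_packages(names):
    """Return the names from `names` that cannot be found (hypothetical helper)."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised for dotted names (e.g., dateutil.parser) whose parent is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


print("Missing packages:", missing_packages(REQUIRED_PACKAGES))
```

If the printed list is empty, the usual `import requests`, `import os`, `import csv`, `import dateutil.parser`, and `import time` statements should succeed.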

Defining Necessary Functions

The code in this section is adapted from Andrew Edward's guide published on Towards Data Science (https://towardsdatascience.com/an-extensive-guide-to-collecting-tweets-from-twitter-api-v2-for-academic-research-using-python-3-518fcb71df2a) and Twitter's Github repository (https://github.com/twitterdev/Twitter-API-v2-sample-code). We provide a walkthrough of the code as we used it, along with brief descriptions of other potential uses.

A series of functions must first be defined to connect with the API and write the harvested data to a CSV file. These functions are foundational for the method we describe in this paper and would need minimal editing to conduct a similar search for a different project. The first function, auth(), returns the Bearer Token necessary to authenticate access to the Twitter API endpoint (Figure 3). The second function, create_headers(bearer_token), takes the output from the auth() function, bearer_token, and formats it correctly to be read by the Twitter API endpoint (Figure 3). The third function, create_url(keyword, start_date, end_date, max_results = 500), builds the data request by defining the Twitter API endpoint to connect to (e.g., https://api.twitter.com/2/tweets/search/all for all archival tweets) and the query parameters (e.g., keywords, start date, end date; Figure 3). The fourth function, connect_to_endpoint(), takes the information from the previous three functions, sends the request to the Twitter endpoint, and returns the response (Figure 3). The fifth and final function needed for this project is the append_to_csv(json_response, filename) function. This function appends the data harvested by connect_to_endpoint() to a CSV file (Figure 4).

Graph: Figure 3 —Defining four functions for creating and sending a request to the Twitter API endpoint and returning the requested data. API = application programming interface.
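As the figure is not reproduced in this record, the four functions described above might look roughly like the sketch below. The function names and the endpoint URL come from the text. The environment variable name `TWITTER_BEARER_TOKEN`, the argument list of `connect_to_endpoint()`, and the particular fields requested are assumptions for illustration and may differ from the authors' code.

```python
import os

SEARCH_URL = "https://api.twitter.com/2/tweets/search/all"  # endpoint named in the text


def auth():
    """Return the Bearer Token, read here from an assumed environment variable."""
    return os.environ.get("TWITTER_BEARER_TOKEN")


def create_headers(bearer_token):
    """Format the token as an HTTP Authorization header."""
    return {"Authorization": f"Bearer {bearer_token}"}


def create_url(keyword, start_date, end_date, max_results=500):
    """Return the endpoint URL and the query parameters for one search."""
    query_params = {
        "query": keyword,
        "start_time": start_date,  # ISO 8601, e.g., "2022-10-01T00:00:00Z"
        "end_time": end_date,
        "max_results": max_results,
        # The fields below are an illustrative subset, not the authors' list.
        "expansions": "author_id",
        "tweet.fields": "id,text,author_id,created_at,lang,public_metrics",
        "user.fields": "id,username,description",
    }
    return SEARCH_URL, query_params


def connect_to_endpoint(url, headers, params, next_token=None):
    """Send one GET request and return the decoded JSON response."""
    import requests  # third-party; imported here so the helpers above work without it

    params = dict(params)  # copy so the caller's dictionary is not modified
    if next_token is not None:
        params["next_token"] = next_token  # ask for a specific page of results
    response = requests.get(url, headers=headers, params=params)
    if response.status_code != 200:
        raise RuntimeError(f"Request failed: {response.status_code} {response.text}")
    return response.json()
```

Keeping URL construction separate from the network call means the query can be inspected before any request is sent.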

Graph: Figure 4 —Function to append harvested Twitter data to a comma-separated file.
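A minimal sketch of an `append_to_csv(json_response, filename)` function is given below, since the figure is not reproduced here. It assumes a response shaped like a Twitter API v2 search result, with tweets under `data` and user objects under `includes.users`. The column order and the choice to return a row count are illustrative assumptions, and the timestamp is stored as the raw string rather than parsed with `dateutil.parser`.

```python
import csv


def append_to_csv(json_response, filename):
    """Append one row per tweet in `json_response` to `filename`; return the row count."""
    # Map author IDs to user objects from the optional "includes" section.
    users = {u["id"]: u for u in json_response.get("includes", {}).get("users", [])}
    rows_written = 0
    with open(filename, "a", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file)
        for tweet in json_response.get("data", []):
            metrics = tweet.get("public_metrics", {})
            author = users.get(tweet.get("author_id"), {})
            writer.writerow([
                tweet.get("author_id"),
                tweet.get("created_at"),  # raw ISO 8601 string, left unparsed
                tweet.get("id"),
                tweet.get("lang"),
                metrics.get("like_count"),
                metrics.get("quote_count"),
                metrics.get("reply_count"),
                metrics.get("retweet_count"),
                author.get("description", ""),  # account bio, if returned
                tweet.get("text"),
            ])
            rows_written += 1
    return rows_written
```

Opening the file in append mode (`"a"`) lets each page of results be added to the same file without overwriting earlier pages.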

Specifying Access and File Parameters

After defining the necessary functions, the parameters for the search and subsequent files must be specified. These parameters will be unique to a given user and project. The first parameter that we specified was the Bearer Token necessary for authorizing the search in the Twitter API endpoint. The Bearer Token is a 113-character string of letters, numbers, and symbols that is generated in a user's Academic Research Project in the Twitter Developer Portal. This token is unique to a user and their specific project and should not be shared outside of your research team (i.e., do not publish it publicly on GitHub or in a paper). Next, we specified the input parameters for our request. These include (a) the Bearer Token, using the auth() function defined above; (b) the headers, using the create_headers() function defined above; (c) our list of keywords; and (d) the start date (e.g., October 1st, 2022) and end date (e.g., October 2nd, 2022) for our search. Next, we defined our file name parameter, filename, which provides the name to be used for the CSV file that will store the data harvested from Twitter. Finally, we added the column names to our CSV file to organize the harvested data. It is important to align the column names with the data structure needed for your analyses and to maintain appropriate mapping between these column names and the locations in which you store data from individual tweets.
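The parameter setup described above could be sketched as follows. The environment variable name, the exact timestamps, the file name, the column names, and the helper `create_csv_with_header` are all hypothetical. The keyword list reuses the DCD-related terms from the example search reported later in the paper. Reading the token from the environment is one way to keep it out of shared code.

```python
import csv
import os

# Read the Bearer Token from an environment variable instead of hard-coding it.
bearer_token = os.environ.get("TWITTER_BEARER_TOKEN", "")
headers = {"Authorization": f"Bearer {bearer_token}"}

# Search inputs (illustrative values).
keywords = ["dyspraxia", "dyspraxic", "DCD"]
start_date = "2022-10-01T00:00:00Z"  # ISO 8601 timestamps; exact times are assumed
end_date = "2022-10-02T00:00:00Z"

# Output file and column names; these must match the order in which
# values are later written for each tweet.
filename = "tweets.csv"
COLUMN_NAMES = [
    "author_id", "created_at", "tweet_id", "lang",
    "like_count", "quote_count", "reply_count", "retweet_count",
    "author_bio", "text",
]


def create_csv_with_header(path, column_names):
    """Create (or overwrite) a CSV file containing only the header row."""
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        csv.writer(csv_file).writerow(column_names)
```

Calling `create_csv_with_header(filename, COLUMN_NAMES)` once before harvesting creates the file that later rows are appended to.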

Specifying Parameters for a Search

The search parameters that are specified are highly project specific and adjustable based on project needs. The Twitter API will recognize multiple types of search parameters for harvesting data, such as standalone parameters (e.g., keywords), conjunction parameters (e.g., is:retweet), and Boolean operators (e.g., OR). More information on building a query can be found here: https://developer.twitter.com/en/docs/twitter-api/tweets/search/integrate/build-a-query. Parameters for the data you want to harvest about a tweet must be specified using the specific variable names of Twitter fields. Twitter fields refer to information about different objects (e.g., tweets, users) that will be returned by your search (e.g., tweet.fields: text). More information about specifying field parameters can be found here: https://developer.twitter.com/en/docs/twitter-api/fields.
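To make the three parameter types concrete, the sketch below assembles a query string from a list of keywords joined with OR, then adds the conjunction operators `-is:retweet` (exclude retweets) and `lang:en` (English only). The helper name `build_query` and its arguments are hypothetical. The operators themselves follow the Twitter API v2 query syntax.

```python
def build_query(terms, exclude_retweets=True, language="en"):
    """Join keywords with OR, then append conjunction operators."""
    if not terms:
        raise ValueError("at least one search term is required")
    # Quote multi-word terms so they are matched as exact phrases.
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    parts = ["(" + " OR ".join(quoted) + ")"]
    if exclude_retweets:
        parts.append("-is:retweet")  # negated conjunction operator
    if language:
        parts.append(f"lang:{language}")
    return " ".join(parts)


print(build_query(["dyspraxia", "dyspraxic", "DCD"]))
# (dyspraxia OR dyspraxic OR DCD) -is:retweet lang:en
```

The parentheses matter: without them, the conjunction operators would apply only to the last keyword in the OR group.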

Harvesting Multiple Pages of Twitter Data

The Twitter API returns data in "pages," where each page contains approximately 500 results (e.g., tweets) for a single search. To harvest data from multiple pages, the Twitter API uses a page token system to indicate what page of data needs to be harvested. This is labeled as the "next token" in the metadata of a search response. This next token needs to be retrieved and stored during a search for use in the next iteration of the search (e.g., next_token = json_response["meta"]["next_token"]). For the current project, we utilized a while loop that continued to iterate through the pages while a next token was present. One caveat to this approach is that the Twitter API limits the rate at which a user can submit search requests to the API. For the search we present here to illustrate our recommended pipeline, the rate limit was one request each second and a maximum of 300 requests per 15-min window. More information on Twitter API rate limits can be found here: https://developer.twitter.com/en/docs/twitter-api/rate-limits. In practice, this rate limit required that our project pause for 1 s after each request and then pause for 601 s after every 300 requests to avoid being rate limited and added to Twitter's denylist, a list of accounts that are permanently blocked from accessing the API (Figure 5).

Graph: Figure 5 —A while loop can be used to harvest data using iterative searches without being rate limited.
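The loop described above might be sketched as follows, since the figure is not reproduced here. The request itself is passed in as a callable, so the paging and pausing logic can be read and tested without network access. The names `harvest_all_pages`, `fetch_page`, and `handle_page` are hypothetical. The pause lengths are the values reported in the text.

```python
import time


def harvest_all_pages(fetch_page, handle_page, sleep=time.sleep,
                      pause_per_request=1, requests_per_window=300,
                      window_pause=601):
    """Request pages until no next token is returned; return the request count.

    fetch_page(next_token) must return one decoded JSON response (a dict).
    handle_page(json_response) stores the page, e.g., by appending to a CSV.
    """
    request_count = 0
    next_token = None  # None requests the first page
    while True:
        json_response = fetch_page(next_token)
        request_count += 1
        handle_page(json_response)
        # The last page has no "next_token" entry in its metadata.
        next_token = json_response.get("meta", {}).get("next_token")
        if next_token is None:
            break
        sleep(pause_per_request)  # stay under one request per second
        if request_count % requests_per_window == 0:
            sleep(window_pause)  # wait out the 15-min window after every 300 requests
    return request_count
```

Using `.get()` for the token avoids a `KeyError` on the final page, where the metadata contains no next token.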

Example Twitter Search and Data Analysis

To demonstrate how to use our recommended pipeline for accessing the Twitter API to extract data related to motor behavior, we harvested all tweets written in English between October 1st and October 2nd 2022 containing terms associated with DCD (i.e., dyspraxia, dyspraxic, DCD) or autism (i.e., autistic, actuallyautistic, autism, autie, autist, aspie, asperger) from archival Twitter data. Retweets were excluded from the search. The tweets were then analyzed to determine the number of unique users using these terms in tweets, the number of users with these terms in their account bio information, and the degree of tweet keyword overlap using a custom Python script. Accounts were labeled as autistic only, DCD/dyspraxic only, or autistic and DCD/dyspraxic if they contained autistic-associated terms (i.e., autistic, autism, autist, autie, asperger, aspie, ASD), DCD-associated terms (i.e., dyspraxia, dyspraxic, DCD, clumsy), or both. Individual tweets were examined for keyword overlap between these two sets of terms.
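The authors' custom script is not shown, but the account-labeling step could be approximated as in the sketch below. The term lists are taken from the text. The matching rules (case-insensitive, whole-word matching) and the function name `label_text` are assumptions, and a different rule, such as substring matching, would give different counts.

```python
import re
from collections import Counter

# Term lists as given in the text for labeling accounts.
AUTISM_TERMS = ["autistic", "autism", "autist", "autie", "asperger", "aspie", "ASD"]
DCD_TERMS = ["dyspraxia", "dyspraxic", "DCD", "clumsy"]


def _compile(terms):
    """Build a case-insensitive, whole-word pattern from a list of terms."""
    alternatives = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


AUTISM_PATTERN = _compile(AUTISM_TERMS)
DCD_PATTERN = _compile(DCD_TERMS)


def label_text(text):
    """Label a bio or tweet by which term sets it contains."""
    text = text or ""  # treat a missing bio as empty
    has_autism = bool(AUTISM_PATTERN.search(text))
    has_dcd = bool(DCD_PATTERN.search(text))
    if has_autism and has_dcd:
        return "autistic and DCD/dyspraxic"
    if has_autism:
        return "autistic only"
    if has_dcd:
        return "DCD/dyspraxic only"
    return "neither"


# Invented example bios, used only to show how the labels are tallied.
example_bios = ["Autistic adult, she/her", "Dyspraxic and proud", "coffee lover", None]
label_counts = Counter(label_text(bio) for bio in example_bios)
print(label_counts)
```

The same function can be applied to tweet text to count keyword overlap between the two term sets.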

Results

Number of Tweets Returned

The initial data harvesting for tweets using terms associated with autism and DCD/dyspraxia returned 11,277 unique tweets, not including retweets. These tweets had 156,653 likes, 16,198 retweets, 2,100 quote tweets, and 1,065 replies.

Number of Individual Accounts

These tweets were created by 8,363 unique accounts. Of these accounts, 1,396 used autism-related labels in their bios, 10 used DCD/dyspraxia-related labels in their bios, and 10 used autism- and DCD/dyspraxia-related labels in their bios. The remaining 6,947 accounts did not include either label in their bios.

Keyword Overlap

Examining the keyword overlap across tweets, there were 11,151 tweets that used autism-related labels, 45 tweets that used DCD/dyspraxia-related labels, and three tweets that used both. Note that the total keyword count is less than the total number of tweets originally returned. This is due to a slight difference in how URL data are recorded when harvested from Twitter. In the initial search, the Twitter API has access to and searches through the full URL included in a tweet, which sometimes contains keywords. The keyword labeling, by contrast, is performed on the stored data, which contain only the shortened Twitter URL, a random string of letters and numbers.
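The mismatch can be illustrated with an invented tweet whose keyword appears only in the expanded link. The sketch assumes the `entities` tweet field was requested, so that each link carries both its shortened `url` and its `expanded_url`. The tweet content, pattern, and function names are hypothetical.

```python
import re

KEYWORD_PATTERN = re.compile(r"autism|autistic|dyspraxia|dyspraxic|dcd", re.IGNORECASE)

# Invented tweet: the stored text holds only the shortened link.
tweet = {
    "text": "Worth a read: https://t.co/AbC123xyz0",
    "entities": {
        "urls": [{
            "url": "https://t.co/AbC123xyz0",
            "expanded_url": "https://example.org/blog/dyspraxia-and-handwriting",
        }]
    },
}


def matches_stored_text(tweet):
    """True if a keyword appears in the tweet text as stored."""
    return bool(KEYWORD_PATTERN.search(tweet["text"]))


def matches_expanded_urls(tweet):
    """True if a keyword appears in any expanded URL attached to the tweet."""
    urls = tweet.get("entities", {}).get("urls", [])
    return any(KEYWORD_PATTERN.search(u.get("expanded_url", "")) for u in urls)


print(matches_stored_text(tweet), matches_expanded_urls(tweet))
```

Storing the expanded URLs alongside the text would let the labeling step see the same content that the search matched.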

Discussion

Social media platforms with publicly available data provide an excellent way for academic researchers to observe and directly sample the expressed perspectives of different populations over time. Twitter is a particularly accessible and rich source of data on the lived experiences of many clinical populations that have developed strong awareness, advocacy, and support networks on the platform. We demonstrated how publicly available Python code can be adapted to fit the needs of academic researchers to examine tweet content and user bio information, specifically in the context of our search for tweets and bio information containing terms related to autism and DCD/dyspraxia. Our pipeline for harvesting and analyzing Twitter data can be adapted to address a variety of research questions related to motor behavior. This approach provides a starting point for development of uniform best-practice guidelines that will help to unify methodology for use of social media data across fields including movement science, clinical neuroscience, and psychology.

We also provided multiple examples of possible analyses with a data set harvested using the code provided. With these analyses, we can begin to characterize discussions of autism and DCD/dyspraxia on Twitter as well as who is leading those discussions. Using the results from the number of tweets analysis, it was clear that tweets containing autism and/or DCD/dyspraxia terms received substantial engagement from other accounts, with the tweets averaging nearly 14 likes and more than one retweet, quote tweet, or reply. Using the results from the number of accounts analysis, we discerned that many of the accounts using autism and/or DCD/dyspraxia-related terms did not include these terms in their bios (83%), indicating that although they are using these terms, they did not include these terms to describe themselves biographically. Using the results from the keyword overlap analysis, we saw that the vast majority of the tweets used autism terms alone (98.9%), with only 0.4% of tweets using DCD/dyspraxia terms and 0.03% of tweets using both types of terms. Using the results of the analyses of the number of tweets and number of accounts together, we identified that the 11,277 tweets including content using autism and DCD/dyspraxia terms came from 8,363 unique accounts. This indicated that there were many Twitter accounts using these terms, but they used these terms in a limited number of tweets, approximately 1.35 tweets per account.
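The summary figures in this paragraph can be reproduced from the counts reported in the Results section, as the short check below shows. Reading "more than one retweet, quote tweet, or reply" as the combined total per tweet is an interpretation, not something the text states explicitly.

```python
# Counts reported in the Results section.
tweets, accounts = 11_277, 8_363
likes, retweets, quotes, replies = 156_653, 16_198, 2_100, 1_065
bio_autism, bio_dcd, bio_both, bio_neither = 1_396, 10, 10, 6_947
kw_autism, kw_dcd, kw_both = 11_151, 45, 3

likes_per_tweet = likes / tweets                        # about 13.89
other_per_tweet = (retweets + quotes + replies) / tweets  # about 1.72 (combined)
pct_no_bio_label = 100 * bio_neither / accounts         # about 83.1
pct_autism_only = 100 * kw_autism / tweets              # about 98.9
pct_dcd_only = 100 * kw_dcd / tweets                    # about 0.4
pct_both = 100 * kw_both / tweets                       # about 0.03
tweets_per_account = tweets / accounts                  # about 1.35

# Tweets matched by the search but not by the keyword labeling step.
unlabeled_tweets = tweets - (kw_autism + kw_dcd + kw_both)  # 78

print(round(likes_per_tweet, 2), round(other_per_tweet, 2),
      round(pct_no_bio_label, 1), round(tweets_per_account, 2), unlabeled_tweets)
```

The four bio categories sum to the 8,363 unique accounts, which confirms that the account counts are mutually exclusive.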

Although there are multiple methods for harvesting data from Twitter, the Twitter API and Python workflow are cost-effective and flexible. The Twitter API can be accessed by a wide range of researchers from graduate students to faculty, and it can be used to access a plethora of tweets and user information. This flexibility allows researchers to ask many different types of questions. Data from the Twitter API can be used to determine which accounts are tweeting the most about a particular population (e.g., [2], [11]), to identify specific topics relevant to a population (e.g., [18]; Tamplain et al., under review in this JMLD special section), or even to examine themes in moderated group discussions identified by dates, times, and hashtags (e.g., Chatterjee et al., under review in this JMLD special section). Given this flexibility, the breadth of the use of Twitter, and the fact that these data are currently free to access, our approach provides an incredible opportunity for gaining better insight into the lived experiences of clinical populations.

Despite the significant strengths of harvesting data from Twitter, there are some limitations. Twitter data provide limited information about the demographics of the sample. Twitter users can be 13 years of age and up and can be located anywhere in the world, but it is not simple to verify a user's age or location. It is also challenging to determine users' clinical characteristics or diagnoses based solely on their user information and tweet content, although we can make inferences based on the terms they choose to describe themselves. To verify the demographics and diagnoses of Twitter users, researchers would need to undertake a highly time-consuming verification on an account-by-account basis. It is also possible that users are not truthful in their self-presentation online. However, this is also a risk with traditional self-report methods, and at least one group has demonstrated that within the disabled community, virtual social identities typically parallel real-life identities ([24]). It is also challenging to extract and analyze nonlanguage-based tweet content, such as memes, images, and emojis. While natural language processing approaches can be used to rapidly process large amounts of text data, image analysis is considerably more complex and resource intensive, and inferences about the intended meaning of memes, videos, images, and emojis are highly subjective. Studies that include both image- and text-based data to assess disabled people's perspectives on their lived experiences may benefit from this multifaceted context ([10]). Similar complications arise when considering how to treat web links to external content, for example, a tweet that points to a blog post that covers a topic in greater detail. While the tweet itself may not contain a substantial amount of content related to a research question, the externally linked source might be highly relevant. At the moment, there are no best-practice guidelines for qualitative analysis of these more complex types of content.
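Because linked and nonlanguage content cannot be interpreted automatically, one modest option is to flag such tweets for manual review. The sketch below is a hypothetical illustration, not part of the authors' workflow: the function names and example texts are invented, the URL pattern is deliberately simple, and the emoji check relies on the rough assumption that most emoji fall in the Unicode category "So" (Symbol, other), which will miss some emoji and catch some non-emoji symbols.

```python
import re
import unicodedata

# Simple pattern for http(s) links; an assumption that suffices for illustration.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_links(text):
    """Return all http(s) links found in the tweet text."""
    return URL_PATTERN.findall(text)


def has_symbol_characters(text):
    """Rough emoji heuristic: any character in Unicode category 'So'."""
    return any(unicodedata.category(ch) == "So" for ch in text)


def needs_manual_review(text):
    """Flag tweets whose meaning may depend on links or emoji."""
    return bool(extract_links(text)) or has_symbol_characters(text)


# Hypothetical tweet texts.
examples = [
    "Wrote more about motor planning here: https://example.org/post",
    "Tying shoes today \U0001F600",
    "Plain text about handwriting.",
]

flags = [needs_manual_review(t) for t in examples]
print(flags)
```

Flagged tweets can then be set aside for a human coder rather than being passed through text-only analyses, where their meaning may be lost.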

In conclusion, the free Academic Research access to Twitter data via their API is an incredible resource that researchers can use to better understand the perspectives of the populations they study. Using the Twitter API in combination with Python—a free, open-source programming language—is a cost-effective and accessible approach for researchers from a wide range of institutional resource levels. Future studies should continue to advance our understanding of the lived experiences of clinical populations through the use of social media platforms such as Twitter. This new knowledge can help researchers to identify factors most relevant to improving the quality of life of clinical populations.

Fears https://orcid.org/0000-0001-7081-0015

Tamplain https://orcid.org/0000-0003-2713-5733

Miller (millerhl@umich.edu) is corresponding author, https://orcid.org/0000-0003-4372-1206

Acknowledgments

We paraphrase and refer to themes throughout the manuscript rather than using direct quotes, so that users' tweets are not subject to greater publicity than they may have reasonably expected, in keeping with best-practice guidelines for use of Twitter data in research (Williams et al., 2017). For this reason, content referenced in this manuscript is not attributed to individual users. Instead, we offer our gratitude to the entire #ActuallyAutistic community on Twitter for the opportunity to learn from their perspectives.

<bold>Author Note:</bold> Out of respect for preferences expressed by many autistic self-advocates in our studies, in the literature (Bottema-Beutel et al., 2021; Botha et al., 2021), and in the community, we have chosen to use identity-first (rather than person-first) language throughout this manuscript when referring to autistic people. The DCD/dyspraxia community has not yet established best practices for language in the literature or expressed strong preferences in our studies, and so for consistency throughout the manuscript, we have chosen to use identity-first language when referring to this group as well. In doing so, it is not our intention to diminish or invalidate the preferences or perspectives of those who prefer person-first language. We recognize that identity is deeply personal and affirm that all individual preferences regarding the language used to express identity are valid and should be respected. We also use the term "motor differences" to describe features that may or may not cause problems or disability, depending on a person's goals, context, and access to appropriate supports or accommodations. We continue to welcome feedback on ways that we can effectively partner with the autistic community to advocate for respect, acceptance, inclusion, and representation in research.

<bold>Funding</bold>: This research was supported in part by the National Institute of Mental Health (K01-MH107774) and the University of Michigan.

<bold>Supplementary Information</bold>: Twitter Data Harvesting and Analysis GitHub Repository. https://github.com/MVDLab/JMLDTwitterAnalysis_Public.

Footnotes

1 We use identity-first language throughout the manuscript; please see the Author Note for additional information about the motivation for this choice.

REFERENCES

1 American Psychiatric Association. (2022). Diagnostic and statistical manual of mental disorders (5th ed.). https://doi.org/10.1176/appi.books.9780890425787

2 Bellon-Harn, M.L., Ni, J., & Manchaiah, V. (2020). Twitter usage about autism spectrum disorder. Autism, 24 (7), 1805–1816. https://doi.org/10.1177/1362361320923173

3 Beykikhoshk, A., Arandjelović, O., Phung, D., Venkatesh, S., & Caelli, T. (2015). Using Twitter to learn about the autism community. Social Network Analysis and Mining, 5 (1), 1–17. https://doi.org/10.1007/s13278-015-0261-5

4 Bhat, A.N. (2021). Motor impairment increases in children with autism spectrum disorder as a function of social communication, cognitive and functional impairment, repetitive behavior severity, and comorbid diagnoses: A SPARK study report. Autism Research, 14 (1), 202–219. https://doi.org/10.1002/aur.2453

5 Botha, M., Hanlon, J., & Williams, G.L. (2021). Does language matter? Identity-first versus person-first language use in autism research: A response to Vivanti. Journal of Autism and Developmental Disorders, 1–9.

6 Bottema-Beutel, K., Kapp, S.K., Lester, J.N., Sasson, N.J., & Hand, B.N. (2021). Avoiding ableist language: Suggestions for autism researchers. Autism in Adulthood, 3 (1), 18–29.

7 Bruns, A., & Stieglitz, S. (2014). Twitter data: What do they represent? It-Information Technology, 56 (5), 240–245. https://doi.org/10.1515/itit-2014-1049

8 Caçola, P., Miller, H.L., & Williamson, P.O. (2017). Behavioral comparisons in autism spectrum disorder and developmental coordination disorder: A systematic literature review. Research in Autism Spectrum Disorders, 38, 6–18. https://doi.org/10.1016/j.rasd.2017.03.004

9 Clyne, W., Pezaro, S., Deeny, K., & Kneafsey, R. (2018). Using social media to generate and collect primary data: The #ShowsWorkplaceCompassion Twitter research campaign. JMIR Public Health and Surveillance, 4 (2), Article 7686. https://doi.org/10.2196/publichealth.7686

10 Cocq, C., & Ljuslinder, K. (2020). Self-representations on social media. Reproducing and challenging discourses on disability. Alter, 14 (2), 71–84. https://doi.org/10.1016/j.alter.2020.02.001

11 Crowson, M.G., Tucci, D.L., & Kaylie, D. (2018). Hearing loss on social media: Who is winning hearts and minds? The Laryngoscope, 128 (6), 1453–1461. https://doi.org/10.1002/lary.26902

12 Deriss, M.J. (2019). Review of topics related to autism spectrum disorder on Twitter. Network Modeling Analysis in Health Informatics and Bioinformatics, 8 (1), Article 3. https://doi.org/10.1007/s13721-019-0195-3

13 Fears, N.E., Palmer, S.A., & Miller, H.L. (2022). Motor skills predict adaptive behavior in autistic children and adolescents. Autism Research, 15 (6), 1083–1089. https://doi.org/10.1002/aur.2708

14 Hu, T., Wang, S., Luo, W., Zhang, M., Huang, X., Yan, Y., ... Li, Z. (2021). Revealing public opinion towards COVID-19 vaccines with Twitter data in the United States: Spatiotemporal perspective. Journal of Medical Internet Research, 23 (9), Article 30854. https://doi.org/10.2196/30854

15 Hunt, J., Zwicker, J.G., Godecke, E., & Raynor, A. (2021). Awareness and knowledge of developmental coordination disorder: A survey of caregivers, teachers, allied health professionals and medical professionals in Australia. Child: Care, Health and Development, 47 (2), 174–183. https://doi.org/10.1111/cch.12824

16 Kilroy, E., Ring, P., Hossain, A., Nalbach, A., Butera, C., Harrison, L., ... Cermak, S.A. (2022). Motor performance, praxis, and social skills in autism spectrum disorder and developmental coordination disorder. Autism Research, 15 (9), 1649–1664. https://doi.org/10.1002/aur.2774

17 Li, C., Ademiluyi, A., Ge, Y., & Park, A. (2022). Using social media to understand web-based social factors concerning obesity: Systematic review. JMIR Public Health and Surveillance, 8 (3), Article 25552. https://doi.org/10.2196/25552

18 Mantzalas, J., Richdale, A.L., Adikari, A., Lowe, J., & Dissanayake, C. (2022). What is autistic burnout? A thematic analysis of posts on two online platforms. Autism in Adulthood, 4 (1), 52–65. https://doi.org/10.1089/aut.2021.0021

19 Mazurek, M.O. (2013). Social media use among adults with autism spectrum disorders. Computers in Human Behavior, 29 (4), 1709–1714. https://doi.org/10.1016/j.chb.2013.02.004

20 Miller, H.L., Caçola, P.M., Sherrod, G.M., Patterson, R.M., & Bugnariu, N.L. (2019). Children with autism spectrum disorder, developmental coordination disorder, and typical development differ in characteristics of dynamic postural control: A preliminary study. Gait & Posture, 67, 9–11. https://doi.org/10.1016/j.gaitpost.2018.08.038

21 Miller, H.L., Sherrod, G.M., Mauk, J.E., Fears, N.E., Hynan, L.S., & Tamplain, P.M. (2021). Shared features or co-occurrence? Evaluating symptoms of developmental coordination disorder in children and adolescents with autism spectrum disorder. Journal of Autism and Developmental Disorders, 51 (10), 3443–3455. https://doi.org/10.1007/s10803-020-04766-z

22 Moral, E., Huete, A., & Díez, E. (2022). #MeCripple: Ableism, microaggressions, and counterspaces on Twitter in Spain. Disability & Society. Advance online publication. https://doi.org/10.1080/09687599.2022.2161348

23 Ngai, E.W., Tao, S.S., & Moon, K.K. (2015). Social media research: Theories, constructs, and conceptual frameworks. International Journal of Information Management, 35 (1), 33–44. https://doi.org/10.1016/j.ijinfomgt.2014.09.004

24 Noble, C.D. (2012). Cyberspace as equalizer: Opening up lifeworlds and empowering persons with disabilities in the Philippines (Doctoral dissertation). University of Hawaii at Manoa.

25 Saha, A., & Agarwal, N. (2016). Modeling social support in autism community on social media. Network Modeling Analysis in Health Informatics and Bioinformatics, 5 (1), 1–14. https://doi.org/10.1007/s13721-016-0115-8

26 Sarkar, T., Forber-Pratt, A.J., Hanebutt, R., & Cohen, M. (2021). Good morning, Twitter! What are you doing today to support the voice of people with #disability? Disability and digital organizing. Journal of Community Practice, 29 (3), 299–318. https://doi.org/10.1080/10705422.2021.1982802

27 Shakes, P., & Cashin, A. (2020). An analysis of Twitter discourse regarding identifying language for people on the autism spectrum. Issues in Mental Health Nursing, 41 (3), 221–228. https://doi.org/10.1080/01612840.2019.1648617

28 Sinnenberg, L., Buttenheim, A.M., Padrez, K., Mancheno, C., Ungar, L., & Merchant, R.M. (2017). Twitter as a tool for health research: A systematic review. American Journal of Public Health, 107 (1), e1–e8. https://doi.org/10.2105/AJPH.2016.303512

29 Sweet, K.S., LeBlanc, J.K., Stough, L.M., & Sweany, N. (2020). Community building and knowledge sharing by individuals with disabilities using social media. Journal of Computer Assisted Learning, 36 (1), Article 12377. https://doi.org/10.1111/jcal.12377

30 Ward, D.M., Dill-Shackleford, K.E., & Mazurek, M.O. (2018). Social media use and happiness in adults with autism spectrum disorder. Cyberpsychology, Behavior, and Social Networking, 21 (3), 205–209. https://doi.org/10.1089/cyber.2017.0331

31 Wetsman, N. (2020). How Twitter is changing medical research. Nature Medicine, 26 (1), 11–13. https://doi.org/10.1038/s41591-019-0697-7

32 White, P., & Forrester-Jones, R. (2020). Valuing e-inclusion: Social media and the social networks of adolescents with intellectual disability. Journal of Intellectual Disabilities, 24 (3), 381–397. https://doi.org/10.1177/1744629518821240

33 Williams, M.L., Burnap, P., & Sloan, L. (2017). Towards an ethical framework for publishing Twitter data in social research: Taking into account users' views, online context and algorithmic estimation. Sociology, 51 (6), 1149–1168.

By Nicholas E. Fears; Riya Chatterjee; Priscila M. Tamplain and Haylie L. Miller
