Result: Graph Data Modeling for Political Communication on Twitter

Title:
Graph Data Modeling for Political Communication on Twitter
Authors:
Contributors:
Wallapak Tavanapong, Department of Computer Science
Source:
archive/lib.dr.iastate.edu/etd/15949/Kumar_iastate_0097M_16251.pdf|||Fri Jan 14 20:49:01 UTC 2022
Publication Year:
2016
Collection:
Digital Repository @ Iowa State University
Subject Terms:
Document Type:
Dissertation thesis
File Description:
application/pdf
Language:
English
Relation:
archive/lib.dr.iastate.edu/etd/15949/; 6956; etd/15949; https://dr.lib.iastate.edu/handle/20.500.12876/30132
DOI:
10.31274/etd-180810-5576
Accession Number:
edsbas.11E20FAC
Database:
BASE

Further Information

Twitter has become a political reality where political parties, presidential candidates, legislators, and journalists post tweets about the latest events, sharing text, pictures, hashtags, and URLs and mentioning other users. Gaining insight from the vast amount of political data on Twitter is only possible with proper computational tools. We propose to store and manage Twitter data in an optimized Neo4j graph database for serving queries about political communication among state legislators of the 50 U.S. states, state reporters, and presidential candidates in the 2016 presidential election. Our rationale for selecting this relatively new database technology is threefold: (1) ease of use in explicitly modeling and visualizing communication relationships among entities of interest; (2) flexibility to evolve the database over time and adapt quickly to changes in user requirements; and (3) a user-friendly, intuitive query interface. We developed a Python-based Google App Engine application that uses the Twitter API to collect tweets from the Twitter handles of the aforementioned political actors. We employed best-practice guidelines in graph database design to develop five different database models in order to isolate the impact of each query optimization technique. We evaluated each model on the same set of tweets, posted between January 1, 2016 and November 11, 2016, using the same set of queries of interest to political communication scholars, and compared the models by average query response time. Our experimental results confirmed the benefits of the best-practice design guidelines. In addition, they show that the optimized database model provides a significant improvement in query response times. Reducing the number of hops used in the graph queries and using database indexes on the most commonly used attributes reduced the average query response time on our dataset by as much as 74.52% and 85.27%, respectively, compared to the reference model.
Nevertheless, the reduction in the average query response time comes with ...
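
The hop-reduction idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's Neo4j schema or any of its five models: plain Python dictionaries stand in for the graph, and the sample data, the function names, and the user-to-tweet-to-mention layout are all assumptions made for illustration. The sketch answers the same question ("whom did this user mention?") once by traversing two hops and once through a precomputed direct edge, and shows how a percentage reduction in response time is computed.

```python
from collections import defaultdict

# Hypothetical toy data: (author, tweet_id, mentioned_user).
TWEETS = [
    ("alice", "t1", "bob"),
    ("alice", "t2", "carol"),
    ("bob", "t3", "alice"),
]


def build_two_hop(tweets):
    """Assumed baseline layout: user -> tweet -> mentioned user."""
    posted = defaultdict(list)    # user -> tweet ids
    mentions = defaultdict(list)  # tweet id -> mentioned users
    for author, tweet_id, mentioned in tweets:
        posted[author].append(tweet_id)
        mentions[tweet_id].append(mentioned)
    return posted, mentions


def build_one_hop(tweets):
    """Assumed denormalized layout: a direct user -> mentioned user edge."""
    mentioned_by = defaultdict(set)
    for author, _tweet_id, mentioned in tweets:
        mentioned_by[author].add(mentioned)
    return mentioned_by


def mentions_two_hop(posted, mentions, user):
    # Two lookups per result: user -> tweets, then tweet -> mentions.
    return {m for t in posted.get(user, []) for m in mentions.get(t, [])}


def mentions_one_hop(mentioned_by, user):
    # One lookup: the direct edge already holds the answer.
    return set(mentioned_by.get(user, set()))


def percent_reduction(baseline_ms, optimized_ms):
    """Percentage by which the optimized time undercuts the baseline."""
    return 100.0 * (baseline_ms - optimized_ms) / baseline_ms
```

Both layouts return the same set of mentioned users; the one-hop version simply skips the intermediate tweet lookup. In this sketch the direct edge duplicates information that is already derivable from the tweets, so it has to be kept in sync whenever tweets are added.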