SciBeh-Topic-Visualization

machine, twitter, learn, technology, application

Topic 12

machine twitter learn technology application extract prefer visualization interpersonal representation language wing classification detect introduction

Standardizing and Benchmarking Crisis-related Social Media Datasets for Humanitarian Information Processing
April 14, 2020 · Original resource · article

Time-critical analysis of social media streams is important for humanitarian organizations to plan rapid response during disasters. The crisis informatics research community has developed several techniques and systems to process and classify big crisis data on social media. However, because the literature relies on a variety of different datasets, it is not possible to compare results or measure progress towards better models for crisis classification. In this work, we attempt to bridge this gap by providing a standard crisis-related dataset. We consolidate labels from 8 annotated data sources and provide 166.1k and 141.5k tweets for the informativeness and humanitarian classification tasks, respectively. The consolidation also results in a larger dataset, which is helpful for training stronger models. We also provide baseline results using CNN and BERT models. We make the dataset available at this https URL.
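Purely as an illustration of the kind of BERT baseline the abstract mentions, the sketch below fine-tunes a binary "informative vs. not informative" tweet classifier; the file names and the tweet_text/label columns are assumptions for the example, not the dataset's actual release format.

```python
# Illustrative BERT baseline for binary informativeness classification of tweets.
# File and column names are assumptions, not the released dataset format.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

data = load_dataset("csv", data_files={"train": "crisis_train.csv",
                                       "test": "crisis_test.csv"})
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Tweets are short, so a small max_length keeps training cheap.
    return tokenizer(batch["tweet_text"], truncation=True,
                     padding="max_length", max_length=64)

data = data.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-crisis-baseline",
                           num_train_epochs=2,
                           per_device_train_batch_size=32),
    train_dataset=data["train"],
    eval_dataset=data["test"],
)
trainer.train()
print(trainer.evaluate())
```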
analysis
social media
response
information
dataset
humanitarian
disaster
processing
organization
machine, twitter, learn, technology, application
library, open, preservation, psa, tool
Motif Discovery Algorithms in Static and Temporal Networks: A Survey
May 19, 2020 · Original resource · preprint

Motifs are the fundamental components of complex systems. The topological structure of networks representing complex systems and the frequency and distribution of motifs in these networks are intertwined. The complexities associated with graph and subgraph isomorphism problems, as the core of frequent subgraph mining, have direct impacts on the performance of motif discovery algorithms. To cope with these complexities, researchers have adopted different strategies for candidate generation and enumeration, and for frequency computation. In the past few years, there has been increasing interest in the analysis and mining of temporal networks. These networks, in contrast to their static counterparts, change over time through the insertion, deletion, or substitution of edges or vertices or their attributes. In this paper, we provide a survey of motif discovery algorithms proposed in the literature for mining static and temporal networks and review the corresponding algorithms based on their adopted strategies for candidate generation and frequency computation. As large amounts of network data are generated on social media platforms, in bioinformatics applications, and in communication and transportation networks, and as distributed computing and big data technology advance, we also survey the algorithms proposed to resolve the CPU-bound and I/O-bound problems in mining static and temporal networks.
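As a toy illustration of the candidate-enumeration and frequency-computation steps the survey discusses, the sketch below counts the simplest non-trivial motif, the triangle, in a small random static graph; it is a deliberately naive stand-in for the surveyed algorithms, not an implementation of any of them.

```python
# Toy motif frequency computation: counting triangles (3-node cliques)
# in a small static graph, by brute force and via networkx's built-in.
from itertools import combinations
import networkx as nx

G = nx.erdos_renyi_graph(n=50, p=0.1, seed=42)  # random example network

# Brute-force enumeration over all 3-node candidate subgraphs ...
brute_force = sum(
    1 for a, b, c in combinations(G.nodes, 3)
    if G.has_edge(a, b) and G.has_edge(b, c) and G.has_edge(a, c)
)

# ... versus per-node triangle counts (each triangle is seen at 3 vertices).
builtin = sum(nx.triangles(G).values()) // 3

print(brute_force, builtin)  # the two counts agree
```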
survey
computer science
network, complex, graph, multiplex, structure
machine, twitter, learn, technology, application
Cultural Convergence: Insights into the behavior of misinformation networks on Twitter
July 7, 2020 · Original resource · preprint

How can the birth and evolution of ideas and communities in a network be studied over time? We use a multimodal pipeline, consisting of network mapping, topic modeling, bridging centrality, and divergence, to analyze Twitter data surrounding the COVID-19 pandemic. We use network mapping to detect accounts creating content surrounding COVID-19, then Latent Dirichlet Allocation to extract topics, and bridging centrality to identify topical and non-topical bridges. We then examine the distribution of each topic and bridge over time and apply the Jensen-Shannon divergence of topic distributions to show communities that are converging in their topical narratives.
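A minimal sketch of the final step of that pipeline, assuming two per-community topic distributions have already been estimated with LDA; the distributions below are invented for illustration, and scipy's jensenshannon returns the square root of the divergence.

```python
# Compare the topic mixtures of two communities with Jensen-Shannon divergence.
# The distributions here are made up; in the pipeline they would come from
# averaging LDA topic proportions over each community's tweets.
import numpy as np
from scipy.spatial.distance import jensenshannon

community_a = np.array([0.50, 0.20, 0.15, 0.10, 0.05])  # topic proportions
community_b = np.array([0.45, 0.25, 0.15, 0.10, 0.05])

js_distance = jensenshannon(community_a, community_b, base=2)
js_divergence = js_distance ** 2  # divergence is the squared distance

print(f"JS divergence: {js_divergence:.4f}")  # near 0 => converging narratives
```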
covid-19
communication
misinformation
modeling
network
twitter
news, misinformation, internet, medium, fake
machine, twitter, learn, technology, application
Data Visualization for Health and Risk Communication

In general, data visualization, especially advanced data visualization, has been viewed by practitioners from various disciplines as a powerful tool with the capacity to highlight the most important elements or key issues and so improve understanding of the provided information. This chapter reviews the research on traditional static data visualization as well as advanced computer-mediated interactive data visualization that has been conducted in the area of health and risk communication. As with traditional data visualizations, numerous types of interactive data visualization have been adopted for presenting various types of health- and risk-related information to the general public. The chapter discusses some of the theoretical frameworks that have been adopted in research on data visualization for health and risk communication. In health and risk communication research, a popular theoretical lens used to explain and predict the effectiveness and/or ineffectiveness of fear appeals is the extended parallel process model.
communication
risk
health
review
research
data visualization
interactive
machine, twitter, learn, technology, application
library, open, preservation, psa, tool
Understanding the Shape of Large-Scale Data
May 5, 2020 · Original resource · blog

Understanding the differences and similarities between complex datasets is an interesting challenge that often arises when working with data. One way to formalize this question is to view each dataset as a graph, a mathematical model for how items relate to each other. Graphs are widely used to model relationships between objects — the Internet graph connects pages referencing each other, social graphs link together friends, and molecule graphs connect atoms bonding with each other.
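A minimal sketch of the "dataset as a graph" framing described above, using networkx and an invented toy friendship list; the node names and summary statistics are illustrative only.

```python
# Toy illustration of modeling a dataset as a graph: nodes are items,
# edges are relationships between them (a made-up social graph here).
import networkx as nx

friendships = [("alice", "bob"), ("bob", "carol"),
               ("carol", "alice"), ("carol", "dave")]

G = nx.Graph()
G.add_edges_from(friendships)

# Simple "shape" summaries one might compare across datasets.
print(G.number_of_nodes(), G.number_of_edges())
print(nx.density(G))
print(nx.degree_histogram(G))  # distribution of node degrees
```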
modeling
data analysis
google
graph
data visualization
mathematics
dataset
learning
relationship
network, complex, graph, multiplex, structure
machine, twitter, learn, technology, application
Capturing and analyzing social representations. A first application of Natural Language Processing techniques to reader’s comments in COVID-19 news. Argentina, 2020

We present a first approximation to the quantification of social representations about COVID-19, using news comments. A web crawler was developed to construct the dataset of readers' comments. We detect relevant topics in the dataset using Latent Dirichlet Allocation and analyze their evolution over time. Finally, we show a first prototype for predicting the majority topics using FastText.
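A hedged sketch of the topic-extraction step, assuming the crawled comments are already available as a list of strings; the toy comments, tokenization, and parameter choices below are illustrative only, not the paper's setup.

```python
# Illustrative LDA topic extraction over reader comments (toy data; in the
# paper the comments come from a purpose-built web crawler).
from gensim import corpora
from gensim.models import LdaModel

comments = [
    "nuevos casos de covid en la provincia",
    "la cuarentena afecta la economia",
    "vacuna y testeos masivos en hospitales",
]
texts = [c.lower().split() for c in comments]  # trivial tokenization

dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
               passes=10, random_state=0)

for topic_id, words in lda.print_topics(num_words=4):
    print(topic_id, words)
```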
comment
covid-19
analysis
natural language processing
news
news, misinformation, internet, medium, fake
machine, twitter, learn, technology, application
COVID-19 Public Sentiment Insights and Machine Learning for Tweets Classification
May 2, 2020 · Original resource · preprint

Along with the Coronavirus pandemic, another crisis has manifested itself in the form of mass fear and panic phenomena, fuelled by incomplete and often inaccurate information. There is therefore a tremendous need to address and better understand COVID-19's informational crisis and gauge public sentiment, so that appropriate messaging and policy decisions can be implemented. In this research article, we identify public sentiment associated with the pandemic using Coronavirus specific Tweets and R statistical software, along with its sentiment analysis packages. We demonstrate insights into the progress of fear-sentiment over time as COVID-19 approached peak levels in the United States, using descriptive textual analytics supported by necessary textual data visualizations. Furthermore, we provide a methodological overview of two essential machine learning classification methods, in the context of textual analytics, and compare their effectiveness in classifying Coronavirus Tweets of varying lengths. We observe a strong classification accuracy of 91% for short Tweets, with the Naive Bayes method. We also observe that the logistic regression classification method provides a reasonable accuracy of 74% with shorter Tweets, and both methods showed relatively weaker performance for longer Tweets. This research provides insights into Coronavirus fear sentiment progression, and outlines associated methods, implications, limitations and opportunities.
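The paper works in R with its sentiment analysis packages; purely as an illustration, the sketch below sets up the equivalent comparison of the two classifiers in Python with scikit-learn, on invented toy tweets and made-up labels.

```python
# Toy comparison of the two classifiers discussed in the abstract
# (Naive Bayes vs. logistic regression) on invented example tweets.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

tweets = ["staying home and staying safe", "so scared of this virus",
          "grocery stores are empty again", "feeling hopeful about recovery"]
labels = ["calm", "fear", "fear", "calm"]  # invented sentiment labels

for clf in (MultinomialNB(), LogisticRegression(max_iter=1000)):
    model = make_pipeline(CountVectorizer(), clf)
    model.fit(tweets, labels)
    print(type(clf).__name__, model.predict(["empty shelves everywhere"]))
```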
covid-19
usa
misinformation
twitter
crisis
r
tweet
machine learning
fear
package
machine, twitter, learn, technology, application
library, open, preservation, psa, tool
Identifying social media manipulation with OSoMe tools
Aug. 11, 2020 · Original resource · youtube

As social media become the major platforms for discussions of important topics like national politics, public health, and environmental policy, there is a growing concern about the manipulation of these information ecosystems and their users. Malicious techniques include astroturfing, amplification of misinformation, and trolling. Such abuses can be carried out by humans as well as by social bots, inauthentic accounts controlled in part by software. The resulting biased reality can fool even professional researchers. While researchers are increasingly interested in detecting and studying these malicious activities, there are serious challenges. First, the collection and analysis of data from social media require significant storage and computing resources. Second, knowledge, experience, and advanced computational skills are necessary to find patterns and signals of suspicious behaviors in large datasets. In this tutorial, we will present free tools from the Observatory on Social Media (OSoMe, pronounced “awe·some”) at Indiana University. We will focus on three tools that aim to help researchers and the general public combat online manipulation: Botometer, which helps detect social bots on Twitter; Hoaxy, which can track and visualize the diffusion of misinformation; and BotSlayer, which helps track and detect potential manipulation of information spreading on Twitter in real time. These tools are equipped with state-of-the-art algorithms and carefully designed user interfaces. They also provide public APIs to allow querying in bulk. They have served as the foundation for hundreds of research papers, and have helped thousands of users combat manipulation on social media.
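As a rough illustration of the bulk querying the abstract mentions, the sketch below follows the usage documented for the botometer-python client; the credential names, constructor arguments, and result fields are assumptions that may differ across client and API versions.

```python
# Hedged sketch of querying Botometer in bulk via the botometer-python client.
# Credential names and constructor arguments follow the client's documented
# usage but may differ across versions; treat this as an assumption.
import botometer

rapidapi_key = "YOUR_RAPIDAPI_KEY"  # placeholder credentials
twitter_app_auth = {
    "consumer_key": "YOUR_CONSUMER_KEY",
    "consumer_secret": "YOUR_CONSUMER_SECRET",
}

bom = botometer.Botometer(wait_on_ratelimit=True,
                          rapidapi_key=rapidapi_key,
                          **twitter_app_auth)

accounts = ["@example_account_1", "@example_account_2"]
for screen_name, result in bom.check_accounts_in(accounts):
    # result is a dict of bot-likelihood scores for the queried account
    print(screen_name, result)
```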
covid-19
tracking
communication
misinformation
bias
threat
social media
detection
algorithm
video
tool
manipulation
tutorial
webinar
news, misinformation, internet, medium, fake
machine, twitter, learn, technology, application
The Four Dimensions of Social Network Analysis: An Overview of Research Methods, Applications, and Software Tools

Social network based applications have experienced exponential growth in recent years. One of the reasons for this rise is that this application domain offers a particularly fertile place to test and develop the most advanced computational techniques for extracting valuable information from the Web. The main contribution of this work is three-fold: (1) we provide an up-to-date literature review of the state of the art in social network analysis (SNA); (2) we propose a set of new metrics based on four essential features (or dimensions) in SNA; (3) finally, we provide a quantitative analysis of a set of popular SNA tools and frameworks. We have also performed a scientometric study to detect the most active research areas and application domains in this field. This work proposes the definition of four different dimensions, namely Pattern & Knowledge discovery, Information Fusion & Integration, Scalability, and Visualization, which are used to define a set of new metrics (termed degrees) for evaluating the different software tools and frameworks of SNA (a set of 20 SNA software tools is analyzed and ranked according to these metrics). These dimensions, together with the defined degrees, allow evaluating and measuring the maturity of social network technologies, providing both a quantitative assessment of them and shedding light on the challenges and future trends in this active area.
technology
application
social network
dimension
machine, twitter, learn, technology, application
library, open, preservation, psa, tool