Consumers lead rich digital lives and demand to get what they want, whenever and however they want it. Brands and companies are recalibrating their strategies to win over a new type of informed, impatient, connected consumer by expanding their portfolios to include more on-demand and personalized products and services. How do we gain insight into the shortcomings and the future offers companies will make to this new consumer? The answer lies in the millions of content items that consumers share online. Trending topics on Twitter are a better indicator of collective trends than the news. People all over the world can express their opinions, communicate with each other, discuss topics, spread news, and influence each other. The large number of users and messages shared every day creates interesting data, and analyzing such data can bring important insights into different areas of study, such as human behavior, marketing, linguistics, industry trends, brand monitoring, and politics. Here, we attempt to answer several questions about Twitter data analysis at scale by producing a scalable end-to-end Twitter network data management pipeline. The pipeline gathers, stores, and models rich relationships from Twitter networks, and analyzes Twitter data using a combination of graph-clustering and topic modeling techniques, with multiple data science methods for graph construction and tweet data processing.
Spontaneous and intentional digital fake news wildfires over online social media can be as dangerous as natural fires. A new generation of data mining and analysis algorithms is required for early detection and tracking of such information cascades. This task focuses on the analysis of tweets related to Coronavirus and 5G conspiracy theories in order to detect misinformation spreaders. The task is part of the 2020 MediaEval Multimedia Evaluation benchmark. Our analysis assumes that people in the same social network community who agree on fake news also write in a similar style, discuss similar topics, produce similar content, and share similar values. We relate the content of the tweets using lexical analysis, discover communities by building a network of retweets, and apply network analysis to the structural data provided. [paper] [slides]
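The two ingredients named above, lexical comparison of tweets and community discovery on a retweet network, can be sketched as follows. This is a minimal illustration rather than the method used in the task submission: it assumes Jaccard word overlap as a stand-in for lexical analysis and a simple deterministic label propagation as a stand-in for community discovery. The function names and the sample user names are hypothetical.

```python
from collections import Counter, defaultdict


def jaccard(text_a, text_b):
    """Lexical overlap of two tweets: Jaccard similarity of their word sets."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def build_retweet_graph(retweets):
    """Build an undirected weighted graph from (retweeter, author) pairs."""
    graph = defaultdict(Counter)
    for retweeter, author in retweets:
        if retweeter == author:
            continue  # ignore self-retweets
        graph[retweeter][author] += 1
        graph[author][retweeter] += 1
    return graph


def label_propagation(graph, max_rounds=20):
    """Assign each user the label carrying the most edge weight among neighbors.

    Nodes are visited in sorted order and ties are broken by the smallest
    label, so the result is deterministic (an assumption made for this
    sketch; common variants visit nodes in random order).
    """
    labels = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            votes = Counter()
            for neighbor, weight in graph[node].items():
                votes[labels[neighbor]] += weight
            best = min(votes, key=lambda label: (-votes[label], label))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


# Hypothetical sample data: two groups of users who only retweet each other.
retweets = [
    ("alice", "bob"), ("carol", "bob"), ("alice", "carol"),
    ("dave", "erin"), ("frank", "erin"), ("frank", "dave"),
]
labels = label_propagation(build_retweet_graph(retweets))
# alice, bob and carol end up sharing one label; dave, erin and frank another.
```

Under the stated assumption that users in one community produce similar content, the community labels could then be combined with pairwise `jaccard` scores of the users' tweets as features for a downstream classifier.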