DATASET AND FRAMEWORK DESCRIPTION OF LATEST NEWS
This section begins by describing the dataset used in this study to identify the behavior of news media in terms of their published content. To our knowledge, no previous work has analyzed news media popularity based on content originality and on their interactions. To this end, we identified 48 news media that are popular worldwide, together with their user accounts on Facebook and Twitter. The framework described in Section II-C was then used to detect content originators.
A. Data Collection
The dataset used in this study is based on the top 48 most popular news media sites as of May 2017. The list is based on the amount of traffic, that is, the number of unique visitors, recorded for each news media site. We manually identified a set of active and authorized user accounts that officially represent the news media sites on both Facebook and Twitter. We considered a single account per news media: where several accounts represent the same news media, we kept the one with the highest number of followers.
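The account selection rule above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study (the accounts were identified manually). The `Account` fields, the handles, and the follower counts are hypothetical, and the sketch assumes that one account is kept per news media on each platform.

```python
from typing import NamedTuple


class Account(NamedTuple):
    media: str       # news media site the account represents
    platform: str    # "facebook" or "twitter"
    handle: str      # hypothetical account identifier
    followers: int   # follower count at collection time


def pick_official_accounts(accounts):
    """Keep, per (media, platform), the account with the most followers."""
    best = {}
    for acc in accounts:
        key = (acc.media, acc.platform)
        if key not in best or acc.followers > best[key].followers:
            best[key] = acc
    return best


# Made-up candidates for illustration only (not values from the dataset).
candidates = [
    Account("examplenews", "twitter", "@examplenews", 1200),
    Account("examplenews", "twitter", "@examplenews_live", 300),
    Account("examplenews", "facebook", "examplenews.page", 5000),
]
selected = pick_official_accounts(candidates)
```

Here `selected` holds one entry for each platform on which the hypothetical site has an account.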
Next, for each news media account, we ran Facebook and Twitter crawlers for one month (8th May – 8th June 2017) to extract timeline posts, their timestamps, and the numbers of likes, shares, comments, retweets, and favorites.
B. Dataset Description
Table I gives a brief description of the dataset acquired from Twitter and Facebook, including the average number of followers and the total number of posts shared.
The news media accounts clearly have more followers on Facebook than on Twitter. On Facebook, the highest numbers of posts were shared by Indianexpress (10.6K) and Hindustantimes (6.7K). On Twitter, the top 10 news media by number of published news items are Bloomberg, Cnn, Indiatimes, Foxnews, Hindustantimes, Indianexpress, Theguardian, Thehill, Time, and Economictimes. The fewest posts were shared by Reddit on Facebook.
The violin plot in Figure 1 offers a further view of the dataset. Figure 1(a) shows the distribution of the number of items posted on Facebook and Twitter by all news media, representing the probability distribution of the data by means of a symmetrical kernel density estimate (KDE). Figure 1(a) shows that, on average, these news media share more tweets than Facebook posts. Although their Facebook accounts have more followers than their Twitter accounts (as indicated in Table I), they share fewer posts on Facebook than on Twitter. Some preliminary work carried out in recent years has likewise reported that Twitter is more widely used as a source of news than Facebook.
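For readers unfamiliar with KDE, the following is a minimal sketch of the density curve that a violin plot mirrors on both sides of its axis. It assumes a Gaussian kernel and a hand-picked bandwidth. The kernel and bandwidth behind Figure 1 are not stated in this section, and the sample values are made up.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of `samples` at a point x."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian bumps, one centred on each observation.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Made-up post counts (in thousands) for illustration only.
posts = [1.0, 2.0, 2.5, 6.0]
density = gaussian_kde(posts, bandwidth=1.0)

# Sanity check: a density should integrate to about 1 (Riemann sum on [-10, 20)).
step = 0.01
area = sum(density(-10.0 + i * step) * step for i in range(3000))
```

A violin plot draws `density(x)` to the left and to the right of a central axis, which is why the shape is symmetrical.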
The distribution of the average number of reader reactions over all posts in our dataset is shown in the box plots of Figure 1 for Twitter (Figure 1(b)) and Facebook (Figure 1(c)). Figure 1(c) shows that news consumers reacted more to Facebook posts than to tweets, which is consistent with the larger number of followers on Facebook (Table I).
In summary, based on our analysis, 77% of the news media were more active on Twitter than on Facebook, as also reported in previous work. Indianexpress and Hindustantimes used both social networks very actively. A few others, namely Reddit and Ipsnews, have only minor engagement with social media compared to other news media, and the remainder mostly use either Facebook or Twitter exclusively. In addition, although news media sites do not share as much content on Facebook as on Twitter, the number of reader reactions on Facebook is considerably higher than on Twitter.
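A share such as the 77% figure can be computed along the following lines. This is a sketch that assumes "more active" means a higher number of posts during the collection window, which this section does not state explicitly. The media names and counts are invented.

```python
def share_more_active_on_twitter(post_counts):
    """post_counts maps a media name to (facebook_posts, twitter_posts)."""
    more_on_twitter = sum(1 for fb, tw in post_counts.values() if tw > fb)
    return more_on_twitter / len(post_counts)


# Invented counts for illustration only (not values from Table I).
counts = {
    "media_a": (100, 400),
    "media_b": (250, 900),
    "media_c": (700, 300),
    "media_d": (50, 60),
}
share = share_more_active_on_twitter(counts)
```

With these invented counts, three of the four media post more on Twitter than on Facebook.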
C. Framework Description of News
The analysis of the news media dataset enables us to investigate interactions among news media on social media, in particular, which news media produce, distribute, and consume the items. Detecting the content originator of a text is an important step in observing information propagation patterns among different news media, especially when replicas are shared by different news media. In that regard, the ConOrigina framework is capable of detecting the content originator of items. ConOrigina was designed to identify the originator of textual content in OSNs based on the SCAP method to detect linguistic patterns and online circadian to detect temporal patterns of the user.
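The two kinds of signal mentioned above can be illustrated with a simplified sketch. It is not the ConOrigina implementation, and how the framework combines the two signals is not described in this section. The linguistic part follows the general idea of n-gram profile methods such as SCAP: a profile is the set of the most frequent n-grams, and profiles are compared by the size of their intersection. The sketch uses character n-grams for simplicity. The temporal part is an hourly activity histogram. All function names, parameters, texts, and timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime


def ngram_profile(text, n=3, size=50):
    """Set of the `size` most frequent character n-grams in `text`."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return {gram for gram, _ in grams.most_common(size)}


def profile_overlap(profile_a, profile_b):
    """Number of n-grams shared by two profiles (larger means more similar)."""
    return len(profile_a & profile_b)


def hourly_activity(timestamps):
    """Fraction of posts published in each hour of the day (24 values)."""
    counts = [0] * 24
    for ts in timestamps:
        counts[ts.hour] += 1
    return [c / len(timestamps) for c in counts]


# Hypothetical known texts per account and one text of unknown origin.
known = {
    "media_a": "markets rallied as markets reopened and markets rallied again",
    "media_b": "the team won the match and the team won the cup",
}
unknown = "markets rallied after markets reopened"

profiles = {name: ngram_profile(text) for name, text in known.items()}
unknown_profile = ngram_profile(unknown)
best = max(profiles, key=lambda name: profile_overlap(profiles[name], unknown_profile))

# Hypothetical posting times for one account.
activity = hourly_activity([
    datetime(2017, 5, 8, 9, 0),
    datetime(2017, 5, 9, 9, 30),
    datetime(2017, 5, 10, 21, 15),
])
```

In this toy example `best` is the account whose profile shares the most n-grams with the unknown text, and `activity` shows at which hours of the day the hypothetical account tends to post.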