Trump and the Media: A sentiment analysis of news articles before and after his inauguration


Donald Trump’s relationship with the media has been a constant tug of war. Even before he became President of the United States, the collective opinion about him in the news kept shifting. While today the sentiment in the mainstream media seems overwhelmingly negative towards Trump, a year ago it was less polarized.

The idea

In 2016 and early 2017, as I read news about Trump’s campaign, his election and then his inauguration, I felt there was a sudden shift in the way the news media was talking about him, especially in the lead-up to his inauguration and the first few weeks of his presidency. I wanted to see if data could support my hypothesis that there was a shift in news sentiment towards Trump before and after his inauguration.

For the final project of our natural language processing class at Syracuse University, Daniela Fernández Espinosa from the Information School, James Troncale from the Linguistics Department and I built a prototype sentiment analyzer to help political figures make better media strategy plans.

I visualized the results of that project and hosted the visualization on my GitHub.

Text analysis

There have been multiple sentiment analyses of Trump’s social media posts. While these projects make the news and garner online attention, few analyses have looked at the media itself. During the 2016 presidential campaign, Data Face ran a text analysis of news articles about Trump and Clinton. The results gained a lot of media attention and steered conversations. I planned to follow a similar approach.

Sentiment in the media

There isn’t a very big shift in polarity, but we can see that the percentage of negative articles increased after Trump’s inauguration while the percentage of positive and neutral articles went down. The high percentage of neutral articles reflects the news media’s objectivity.

In the days leading up to the inauguration, a lot of what was written reflected the media's uncertainty about what the presidency under Trump would look like. There was a curious mix of optimistic articles and cautionary ones. But immediately after his inauguration came the aftermath of the Russian hacking allegations, followed by the 'travel ban', during which Trump lashed out at the media. The number of negative articles rose following these two incidents.


Framework: Python’s NLTK toolkit and its sentiment analyzer module.

Part of this project was training a Naive Bayes Classifier on a manually tagged set of articles about a political figure. In our case, we chose Trump because of the immense media attention given to him.
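
As a rough illustration, here is a minimal sketch of that training step with NLTK, using simple word-presence features as a stand-in for the feature sets described further below. The articles, tags and helper names are illustrative, not our actual data or code.

    import nltk

    def extract_features(article_text):
        # Placeholder feature extractor: marks each lowercased token as present.
        # (Requires NLTK's 'punkt' tokenizer data: nltk.download('punkt'))
        tokens = nltk.word_tokenize(article_text.lower())
        return {token: True for token in tokens}

    # Each item pairs an article's text with its manually assigned tag.
    tagged_articles = [
        ("Supporters praised the smooth transition of power.", "Positive"),
        ("Critics called the executive order chaotic and rushed.", "Negative"),
        ("The ceremony is scheduled for Friday at noon.", "Neutral"),
    ]

    train_set = [(extract_features(text), tag) for text, tag in tagged_articles]
    classifier = nltk.NaiveBayesClassifier.train(train_set)

    # Predict the sentiment of a new, untagged article.
    print(classifier.classify(extract_features("The travel ban sparked angry protests.")))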

We collected around 2,000 articles about Trump, published in the month before and the month after his inauguration, from the following news sites: Chicago Tribune, CNN, FOX, LA Times, New York Times, Slate, Washington Post and Washington Times. We randomly selected 20% of our corpus and manually tagged those articles as Positive, Negative or Neutral. The final tag assigned to each article in the training set was the majority sentiment among our individual tags. For example, if the three of us individually tagged an article Positive, Positive and Negative, its final tag was Positive. If we encountered a tie, with all three of us disagreeing, we revisited the article together and came to a consensus on the final tag.
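
The majority rule itself is simple; a sketch of it, with a placeholder for the ties we resolved by discussion, might look like this:

    from collections import Counter

    def final_tag(annotator_tags):
        # Return the tag at least two of the three annotators agreed on,
        # or None for a three-way tie (those articles were revisited together).
        tag, count = Counter(annotator_tags).most_common(1)[0]
        return tag if count >= 2 else None

    print(final_tag(["Positive", "Positive", "Negative"]))  # Positive
    print(final_tag(["Positive", "Negative", "Neutral"]))   # None -> discuss and decide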

After we had our tagged set, we ran a preliminary analysis on it to get an idea of what we were dealing with.

Tag set

As we suspected, the number of negative articles in the tagged set increased after his inauguration, so we hypothesized that this would be true for our entire data set as well.

We created a few feature sets that we thought would help analyze the sentiment in a news article. Sentiment analysis on news is very subjective and each model will be different from the next. For our model, we created the following feature sets:

  • Quotes: Does the number of quotes in an article determine its sentiment?
  • No Punctuation: Does punctuation bear any weight on sentiment?
  • Exclamations and Question Marks: Does having an exclamation mark or question mark in the text affect the polarity of that text?
  • Word Polarity: Each word in the article is given a polarity score based on the MPQA lexicon, and the scores are then summed to determine the sentiment of the article.
  • Adjective Polarity: Each adjective in the article is given a polarity score based on the MPQA lexicon, and the scores are then summed to determine the sentiment of the article (see the sketch after this list).
  • Stopwords: Do words like 'then', 'a', 'is', 'an', and so on have any say in the sentiment of an article?
  • Bigram: Takes two consecutive words into account when analyzing sentiment. For example, the bigram ‘mexico border’ has a negative connotation in our data set.
  • Unigram: Treats every single word in the article as a feature. For example, if the word “wrong” appears in many articles tagged Negative, then the machine will assume that a new article containing the word “wrong” will also be Negative.
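
To make the Adjective Polarity idea concrete, here is a rough sketch of how adjectives can be pulled out with NLTK's part-of-speech tagger and scored against a lexicon. The tiny mpqa_lexicon dict below is a stand-in for the real MPQA subjectivity lexicon, which has to be loaded separately.

    import nltk  # requires the 'punkt' and 'averaged_perceptron_tagger' data

    # Stand-in for the MPQA subjectivity lexicon: word -> prior polarity.
    mpqa_lexicon = {"great": "positive", "wrong": "negative", "chaotic": "negative"}

    def adjective_polarity_scores(article_text):
        # Tally positive/negative/neutral counts over the article's adjectives.
        tagged = nltk.pos_tag(nltk.word_tokenize(article_text))
        scores = {"positive": 0, "negative": 0, "neutral": 0}
        for word, pos in tagged:
            if pos.startswith("JJ"):  # Penn Treebank adjective tags: JJ, JJR, JJS
                scores[mpqa_lexicon.get(word.lower(), "neutral")] += 1
        return scores

    print(adjective_polarity_scores("The chaotic rollout was wrong, critics said."))
    # expected: {'positive': 0, 'negative': 2, 'neutral': 0}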

After defining our feature sets, we went on to test the accuracy of our model. In the end, the Adjective Polarity feature set scored the highest in terms of accuracy, and that is what our final model is based on. It takes every adjective in the text and assigns it a polarity score based on the MPQA lexicon; the Negative, Positive and Neutral scores are then tallied for each article, and the sentiment with the highest score becomes the article’s final tag. We ran the rest of the corpus through our program and exported the results to a CSV file. You can see all the results here.
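
For reference, measuring a feature set's accuracy and exporting the final tags can be done with NLTK's built-in accuracy helper and Python's csv module. The toy feature dicts, article IDs and file name below are illustrative only.

    import csv
    import nltk

    # Toy feature dicts stand in for the real feature sets; tags follow the
    # Positive/Negative/Neutral scheme described above.
    train_set = [({"chaotic": 1}, "Negative"), ({"successful": 1}, "Positive"),
                 ({"scheduled": 1}, "Neutral"), ({"wrong": 1}, "Negative")]
    test_set = [({"wrong": 1}, "Negative"), ({"successful": 1}, "Positive")]

    classifier = nltk.NaiveBayesClassifier.train(train_set)
    print("Accuracy:", nltk.classify.accuracy(classifier, test_set))

    # Tag the rest of the (untagged) corpus and write the results to a CSV file.
    untagged = {"article_0017": {"chaotic": 1}, "article_0042": {"successful": 1}}
    with open("sentiment_results.csv", "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["article_id", "predicted_sentiment"])
        for article_id, features in untagged.items():
            writer.writerow([article_id, classifier.classify(features)])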

While we were satisfied with our results, the sentiment analysis model itself is very subjective, and hence the results here should be taken as nothing more than the outcome of an educational pursuit.

About the author

Mahima Singh is a data journalist at the Palm Beach Post in South Florida. Before that, she was with a news analysis and media criticism website in India. She has an MS in Computational Journalism from Syracuse University, where she learned to program in Python and hasn't stopped since.

Explore the project here.

Image: torbakhopper.