Fake News Detection

March 26, 2026

The 2016 United States presidential election marked a turning point in the global conversation about misinformation. For the first time, 'fake news' entered mainstream political and media discourse not just as a label but as a documented, measurable phenomenon that shaped voter perceptions, drove social media engagement, and influenced electoral outcomes. Studies conducted after the election found that fabricated stories - many originating from politically motivated websites or foreign-operated influence operations - received millions of shares, often outperforming credible journalism in reach and engagement. The challenge, however, was not merely the existence of fake news. It was the speed at which it spread, the difficulty of distinguishing it from real news at a glance, and the lack of scalable tools to intercept or flag it in real time.

This project was motivated by that challenge. The business question driving my analysis is: can a machine learning system, trained on labeled examples of true and fake news from the 2016-2018 election cycle, learn to distinguish the two with meaningful accuracy and, in doing so, identify the linguistic, emotional, and topical fingerprints that separate misinformation from credible reporting? The answer has direct applications for social media platforms, news aggregators, fact-checking organizations, and election integrity agencies that need to evaluate the credibility of political content at scale and in near real time.

You can view the source code for this project on my GitHub.