Here is a brief overview of each of the articles I wrote for my Web Mining class at NJIT, taught by Professor Cody Buntain.

An Exploratory Data Analysis (EDA) on the CDC’s COVID-19 Data in the Tri-State Area

In this first article, we explore a dataset of Coronavirus cases and deaths in New York, New Jersey, and Connecticut, obtained from the Centers for Disease Control and Prevention. …

Data Science Report

A Comparative Analysis of 3 Separate Datasets

By Doug Rizio, Trishala Suryavanshi, Mohammed Yahya, and Varun Garg

COVID-19, the novel Coronavirus. First detected in late December of 2019 as a viral outbreak in Wuhan, China, this mysterious new disease with pneumonia-like symptoms quickly spread throughout the rest of Mainland China and ultimately infected every major population center around the world. As of May 12th, 2021, 159,319,384 confirmed cases of COVID-19 around the world have been reported to the WHO, including 3,311,780 deaths. In the United States, the country with the greatest number of total COVID cases at 32,424,637, an estimated 576,814 people have died so far.


Data Science Tutorial

With Collaborative Filtering, Implicit Data Collection, K Nearest Neighbors, and Cosine Similarity

Reddit — AKA, “The Front Page of the Internet.” Just over 15 years ago, Reddit began as a small and little-known website made by college students, featuring anonymous forums or “subreddits” for topics mostly related to science and programming. Now, Reddit is one of the largest social networking platforms in the world, with an estimated 430,000,000 active users and 100,000 active subreddits currently online as of early 2021.

Users on Reddit submit everything from pictures, animated GIFs, and videos to text-based opinion posts and links to news stories. While some of the content on Reddit is original, much of it…

Data Science Tutorial

K-Means Clustering on Twitter Politicians

When social media platforms like Facebook and Twitter were first created a decade and a half ago, few could have predicted how influential they would ultimately become in shaping the political landscape of the modern world. [1]

Today, however, just about every politician or public figure recognizes the importance of an online presence. Barack Obama is sometimes known as the first social media President, having leveraged the burgeoning power of Facebook to expand his reach beyond the scope of traditional media, quickly cultivating his image from a relatively anonymous senator to a national icon of hope and change in 2008…


Even in the earliest days of the Coronavirus pandemic, many people took to social media platforms like Twitter to share their thoughts about the deadly disease. Questions about COVID’s spread in China, concerns about its rate of mortality, debates about government responses, and conspiracy theories about its origins were just a few of the many related topics talked about online. As with any popular subject, users often found themselves at odds with each other, arguing in favor of their own opinions against the opposing side and driving public discourse throughout the country.

Given the increasing political polarization in the…


Social media platforms have become some of the most popular technologies of the decade. Over 72% of Americans use some type of social media as of 2019, with a significant number of people visiting the platforms every single day, and large numbers of users spending several hours on them daily. [1] Between YouTube, Facebook, Instagram, Reddit, and Twitter, the uses of social media are many and varied, from uploading videos, posting selfies, and sharing memes to staying in contact with friends and family or making connections with online communities.

However, the social media sphere is also becoming an important…

Disclaimer: this EDA studies the total numbers of COVID-19 cases and deaths in each state, not cases and deaths per capita. Its primary purpose is to make observations on general trends in the changes of cases and deaths over time.

The Crisis of Coronavirus

Next to the presidential election, COVID-19 was undoubtedly the most defining event of 2020. The novel Coronavirus is an unprecedented pandemic on a global scale, and its ongoing impact on the world not only sets the stage for the next few years but may also determine the fate of the entire decade. Many temporary measures being put into place to…

Doug Rizio

