Understanding the Environmental Factors that Contribute to Spread of COVID-19
Data Mining
An attempt, made in March 2020, to use data mining to identify the environmental factors that contribute most to the spread of COVID-19
About the project
This project aims to provide insight into the local factors that contribute to the spread of COVID-19 in a given area. To achieve this, we collected data from multiple sources and built a dataset describing a range of factors for different locations. We chose these factors by looking at those that typically contribute to the spread of other infectious diseases. Next, we fit a multivariable linear regression model to estimate the importance of each attribute in relation to the rate of spread, which was our pivot (target) attribute. The model was written in Ruby on top of Ruby's Matrix class. It learned the attribute weights with the normal equation, adapted dynamically to the number of independent attributes, and handled the pivot attribute wherever it was located within each row. We also used basic data visualization to explore the data further, particularly for variables that were assigned high importance or that piqued our interest.
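Since the original code is not published, the following is only a minimal sketch of how a normal-equation fit might look with Ruby's Matrix class. It is not the project's implementation. The method name fit_normal_equation, the pivot_index parameter, the added intercept column, and the sample rows are all assumptions made for illustration.

```ruby
require 'matrix'

# Hypothetical helper (not the project's code): learns linear regression
# weights with the normal equation  theta = (X^T X)^-1 X^T y.
#
# rows        - array of numeric arrays, one per location
# pivot_index - position of the pivot (target) attribute within each row
#
# Returns an array of weights: the intercept first, then one weight per
# independent attribute, in their original column order.
def fit_normal_equation(rows, pivot_index)
  # Pull the pivot attribute out of every row to form the target vector.
  targets = rows.map { |row| row[pivot_index].to_f }

  # Every other column is an independent attribute, so the number of
  # attributes is taken from the data. A leading 1.0 models the intercept.
  features = rows.map do |row|
    attrs = row.each_with_index.reject { |_, i| i == pivot_index }.map(&:first)
    [1.0] + attrs.map(&:to_f)
  end

  x  = Matrix[*features]
  y  = Vector[*targets]
  xt = x.transpose

  # Matrix#inverse raises an error if X^T X is singular, for example when
  # two attributes are perfectly collinear.
  ((xt * x).inverse * xt * y).to_a
end

# Made-up rows generated from  pivot = 1 + 2*a + 3*b, with the pivot stored
# in the middle column of each row.
sample_rows = [[1, 6, 1], [2, 8, 1], [1, 9, 2], [3, 13, 2]]
weights = fit_normal_equation(sample_rows, 1)
puts weights.inspect
```

On this made-up data the returned weights should be close to 1, 2 and 3, up to floating-point error. When attributes are measured on different scales, raw weights are not directly comparable as measures of importance, so a sketch like this would usually be paired with attribute normalization; whether the project did so is not stated here.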
Please note that the code for this project is not available online because of the updated University of Rochester Academic Honesty policy.
Click on "Learn More" to read the detailed paper about this project.