# COGS-109-Modeling-and-Data-Analysis
This project uses linear regression and k-means clustering to analyze the Eating Habits dataset, which contains variables used to determine obesity levels.
Research Focus:
Using exploratory linear regression and clustering, we examine several attributes in the dataset to identify which ones best predict an individual's weight.
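As a rough illustration of the regression half of this idea, the sketch below fits a one-variable least-squares line for each candidate attribute and ranks the attributes by R². It is not the code from the notebook. The column names (`Height`, `Age`, `FAF`) are assumptions, and the values are invented toy numbers, not rows from the dataset.

```python
# Toy values invented for illustration; NOT taken from ObesityDataSet.csv.
# The column names (Height, Age, FAF) are assumptions about the dataset.
toy = {
    "Height": [1.55, 1.62, 1.70, 1.75, 1.80, 1.88],
    "Age": [21, 35, 19, 42, 27, 30],
    "FAF": [2.0, 0.0, 3.0, 1.0, 1.0, 0.0],
}
weight = [52.0, 64.0, 70.0, 85.0, 88.0, 105.0]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y):
    """Fraction of the variance in y explained by the fitted line."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Score each attribute on its own, then sort from best to worst predictor.
scores = {name: r_squared(values, weight) for name, values in toy.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

Scoring attributes one at a time ignores interactions between them, so a ranking like this is only a starting point for exploration.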
Dataset Information:
The dataset consists of data collected from individuals in Mexico, Peru, and Colombia. It can be used to estimate obesity levels based on eating habits and physical condition. There are 2111 instances and 17 attributes. Each record is labeled with one of seven classes: Insufficient Weight, Normal Weight, Overweight Level I, Overweight Level II, Obesity Type I, Obesity Type II, and Obesity Type III.
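A quick way to check a local copy of the file against these numbers is to count rows, columns, and records per class. The helper below is a minimal sketch using only the standard library. The function name `summarize` is hypothetical, the label column name `NObeyesdad` and the label spellings are assumptions, and the sample rows are made up.

```python
import csv
import io
from collections import Counter


def summarize(csv_file, label_column="NObeyesdad"):
    """Return (row count, column count, rows per label) for an open CSV file.

    The default label column name is an assumption about the dataset.
    """
    reader = csv.DictReader(csv_file)
    labels = Counter(row[label_column] for row in reader)
    return sum(labels.values()), len(reader.fieldnames), dict(labels)


# Tiny made-up sample, NOT real rows from the dataset.
sample = io.StringIO(
    "Height,Weight,NObeyesdad\n"
    "1.62,64.0,Normal_Weight\n"
    "1.80,87.0,Overweight_Level_I\n"
    "1.52,56.0,Normal_Weight\n"
)
n_rows, n_cols, per_class = summarize(sample)
print(n_rows, n_cols, per_class)
```

To run it on the real data, pass a file opened with `open("ObesityDataSet.csv", newline="")`. If the file matches the description above, it should report 2111 rows and 17 columns.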
NOTE:
The main report can be found under: "COGS 109 Final Report.pdf"
The Jupyter notebook containing our code can be found under: "COGS 109 Final report.ipynb"
The presentation poster can be found under: "Obesity Analysis Poster"
The dataset we used can be found under: "ObesityDataSet.csv"
Dataset credit: UCI Machine Learning Repository, https://archive.ics.uci.edu/ml/datasets/Estimation+of+obesity+levels+based+on+eating+habits+and+physical+condition+