Author Archives: Jonathan Asanjarani
List of Tables
Table 1 Basic Descriptives of the Cleveland training data
Table 2 Variable descriptives based on heart disease presence
Table 3 Correlation Statistic between individual variables and heart disease presence
Table 4 Model Results
Figure 1
Data Management Plan Overview
Our data management plan ensures the organized, secure, and ethical handling of all project data. We will acquire datasets from the UCI Machine Learning Repository and follow their terms of use. The data will be stored securely on a personal computer. We will document all data processing steps, including cleaning, transformation, and analysis, ensuring transparency […]
Digital References
Sex-Specific and Regional Analysis of Heart Disease Prediction Using Machine Learning Algorithms: Insights from the UCI Irvine Public Heart Disease Datasets (Cleveland and Long Beach)
Jonathan Asanjarani
City University of New York Graduate Center
DATA 79000: Capstone Project and Thesis
Advisor: Johanna Devaney
November 25th, 2024
Software and Tools Used
Datasets
Guidelines and Methodological References
Additional Resources for Citing Software […]
A Note on Technical Specifications
This project used Google Colab as the development environment. Google Colab is a cloud-based Python platform providing access to GPUs for accelerated computation. Python (version 3.8) was used in the Google Colab environment, with additional libraries and frameworks included, such as Scikit-learn, XGBoost, Pandas, NumPy, Matplotlib, and Seaborn, as detailed in the References section. The […]
Data Dictionary
Significant Variables
Digital Manifest
Project Components 1. Capstone Report (Print and Digital) 2. Exploratory Data Analysis (EDA) Notebook 3. Machine […]
Discussion & Findings
Discussion Key findings: My project leveraged the Cleveland and VA Long Beach datasets from the “Heart Disease” database, which was donated to the UCI Machine Learning Repository, to explore the binary classification of heart disease presence using the available demographic and clinical features. Through exploratory data analysis (EDA), data cleaning, transformation experiments, and model […]
ASCVD (Atherosclerotic Cardiovascular Disease) Risk Score (Cleveland And VA Long Beach)
Atherosclerotic Cardiovascular Disease Risk Calculation on Cleveland Dataset The 2013 ASCVD (Atherosclerotic Cardiovascular Disease) risk score was evaluated on the Cleveland dataset, yielding key performance metrics. The score achieved an accuracy of 69.64%, indicating that approximately 70% of predictions matched actual outcomes. Precision was 63.58%, reflecting the proportion of correctly identified positive cases among all […]
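As a rough illustration of how the metrics named in this excerpt are defined, the sketch below computes accuracy, precision, recall, and F1 from binary labels. It is not the project's evaluation code: the function name `classification_metrics` and the toy labels are assumptions made for this example, and the numbers it produces are unrelated to the results reported for the ASCVD score.

```python
def classification_metrics(y_true, y_pred):
    """Compute basic binary-classification metrics (1 = disease present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Precision: correctly identified positives among all predicted positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: correctly identified positives among all actual positives.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In practice the same quantities are available from Scikit-learn's `accuracy_score`, `precision_score`, `recall_score`, and `f1_score`, which is likely the more convenient route in a Colab notebook.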
Male Vs. Female
Are the Best-Performing Models More Effective for the Male vs. Female Population? The highest-performing models were identified in Experiment 2, showcasing robust predictive capabilities. The Random Forest classifier emerged as the top performer, achieving a mean accuracy of 88.33%, a mean precision of 91.79%, a mean recall of 82.00%, and a mean F1-score of 83.67%. […]
Transformation 3: Cleveland Only
Optimizing Feature Engineering In this third experiment, the focus is on enhancing the feature engineering component to improve model performance through targeted transformations. The following transformations were applied: (1) a logarithmic transformation for Resting Blood Pressure (trestbps) and Cholesterol (chol) to reduce skewness and stabilize variance; (2) a squared transformation of Maximum Heart Rate (thalach), […]
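The first two transformations listed in this excerpt can be sketched as follows. This is a minimal illustration under stated assumptions, not the notebook's actual code: the excerpt does not say whether a natural log or another base was used, or whether transformed values replaced the original columns, so the natural log, the helper name `transform_features`, and the output keys `trestbps_log`, `chol_log`, and `thalach_sq` are all hypothetical choices. The sample record is made up.

```python
import math


def transform_features(record):
    """Return a copy of `record` with log and squared features added.

    Assumes `trestbps` and `chol` are strictly positive, as the
    logarithm is undefined otherwise.
    """
    out = dict(record)
    # (1) Log transform to reduce right skew (natural log is an assumption).
    out["trestbps_log"] = math.log(record["trestbps"])
    out["chol_log"] = math.log(record["chol"])
    # (2) Squared transform of maximum heart rate.
    out["thalach_sq"] = record["thalach"] ** 2
    return out


# Hypothetical patient record, for illustration only.
sample = {"trestbps": 130, "chol": 250, "thalach": 150}
transformed = transform_features(sample)
print(transformed)
```

With Pandas, the equivalent column-wise operations would typically use `numpy.log` on the `trestbps` and `chol` columns and `** 2` on `thalach`.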


