Academic · 2025

KNN Classification: Vertebral Column Health Analysis

A k-nearest neighbors (KNN) classifier that compares five distance metrics for diagnosing vertebral conditions. Uses six biomechanical features from the UCI Vertebral Column dataset to predict normal vs. abnormal spinal conditions.

Problem

Diagnosing vertebral conditions depends on accurate classification from biomechanical measurements. For a KNN classifier, accuracy depends heavily on the distance metric and the number of neighbors k, so both need systematic evaluation before the model can support clinical decisions.

Approach

Conducted a KNN analysis of the UCI Vertebral Column dataset using its 6 biomechanical features (pelvic incidence, pelvic tilt, lumbar lordosis angle, sacral slope, pelvic radius, grade of spondylolisthesis). Implemented binary classification (normal=0, abnormal=1) and compared five distance metrics: Euclidean (Minkowski p=2), Manhattan (Minkowski p=1), Minkowski with variable p, Chebyshev, and Mahalanobis. Evaluated performance with confusion matrices, sensitivity, specificity, precision, F1-scores, and learning curves, and compared uniform against distance-weighted voting. Built the complete analysis pipeline in a Jupyter Notebook with pandas, NumPy, scikit-learn, matplotlib, and seaborn.
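To make the metric comparison concrete, here is a minimal, dependency-free sketch of the Minkowski-family distances and a KNN vote with optional distance weighting. It is an illustration only, not the notebook's code (the project itself used scikit-learn). The function names, the toy 2-D points, and the small epsilon guard are all assumptions. Mahalanobis is omitted because it needs the inverse covariance matrix of the features, which is usually computed with NumPy or SciPy.

```python
from collections import defaultdict


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan(a, b):
    """Manhattan distance: Minkowski with p=1."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Euclidean distance: Minkowski with p=2."""
    return minkowski(a, b, 2)


def chebyshev(a, b):
    """Chebyshev distance: the largest per-feature difference (p -> infinity)."""
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, dist, weighted=False):
    """Label `query` by a vote among its k nearest training points.

    With weighted=True each neighbor votes with weight 1/distance;
    the 1e-12 term is an arbitrary guard against division by zero.
    """
    neighbors = sorted(
        (dist(x, query), label) for x, label in zip(train_X, train_y)
    )[:k]
    votes = defaultdict(float)
    for d, label in neighbors:
        votes[label] += 1.0 / (d + 1e-12) if weighted else 1.0
    return max(votes, key=votes.get)


# Toy 2-D points, made up for illustration (not the vertebral data):
# one "abnormal" point very close to the query, two "normal" points farther away.
train_X = [(0.1, 0.0), (2.0, 0.0), (0.0, 2.0)]
train_y = [1, 0, 0]
query = (0.0, 0.0)

plain = knn_predict(train_X, train_y, query, k=3, dist=euclidean)
weighted = knn_predict(train_X, train_y, query, k=3, dist=euclidean, weighted=True)
print(plain, weighted)
```

On this toy data the uniform vote follows the two distant neighbors while the weighted vote follows the single close one, which is the kind of difference a weighted voting analysis looks for.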

Impact

Identified the best-performing combinations of distance metric and k for vertebral condition classification through systematic comparison. Showed how the choice of metric shifts the balance between sensitivity and specificity, a trade-off that matters when a classifier is used in a clinical context.
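The sensitivity and specificity trade-off can be read directly off a binary confusion matrix. The sketch below is a hypothetical helper, not taken from the notebook. It assumes abnormal=1 is the positive class, and the labels in the example are invented purely to show the arithmetic.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix counts and derived scores for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # share of abnormal cases caught
    specificity = ratio(tn, tn + fp)   # share of normal cases cleared
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "tp": tp, "fn": fn, "fp": fp, "tn": tn,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Invented labels for illustration only (1 = abnormal, 0 = normal).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
m = binary_metrics(y_true, y_pred)
print(m)
```

In this made-up example the classifier catches 3 of 4 abnormal cases (sensitivity 0.75) but clears only 2 of 4 normal ones (specificity 0.5). Running the same computation per distance metric shows which side of the trade-off each metric favors.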

Key Metrics

Distance Metrics: 5 compared
Features: 6 biomechanical
Classification: Binary (Normal/Abnormal)
Evaluation: Multi-metric analysis
Dataset: UCI ML Repository

Technologies

Python 3.12, Jupyter Notebook, pandas, NumPy, scikit-learn, matplotlib, seaborn, SciPy

My Role

Sole developer. Conducted exploratory data analysis, implemented KNN classifiers with multiple distance metrics, compared them using sensitivity, specificity, and F1-scores, generated learning curves and confusion matrices, and documented the methodology and findings in a Jupyter Notebook. Course project for DSCI 552 (Machine Learning) at USC.

Team Size: 1 person