This section explains how to run the Student Grade Prediction pipeline, generate sample input data, and make predictions.
Clone the repository:
git clone https://github.com/yourusername/student-grade-prediction.git
cd student-grade-prediction
Install dependencies:
pip install -r requirements.txt
Train and evaluate a model using the CLI:
# Train on the math dataset
python src/main.py --dataset math --model random_forest
# Train on the Portuguese dataset
python src/main.py --dataset portuguese --model random_forest
Training writes its outputs to:
results/metrics/*.json
results/models/*.pkl
results/logs/project.log
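To review the metrics after a training run, the JSON files can be read back with a few lines of Python. This is a minimal sketch: `load_metrics` is a hypothetical helper, not part of the repository, and it makes no assumption about which keys the metric files contain.

```python
import json
from pathlib import Path


def load_metrics(metrics_dir="results/metrics"):
    """Return {file name: parsed JSON} for every .json file in metrics_dir."""
    metrics = {}
    for path in sorted(Path(metrics_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            metrics[path.name] = json.load(fh)
    return metrics


if __name__ == "__main__":
    # Print each metrics file next to its contents.
    for name, values in load_metrics().items():
        print(name, values)
```

If the directory is missing or empty, the function returns an empty dictionary and nothing is printed.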
Use the helper script to create valid input files for prediction (they match the training schema exactly):
python src/generate_sample_data.py
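The helper script's source is not reproduced here. As a rough illustration of the kind of step it describes (a few rows of a dataset with the `G3` column removed), the sketch below uses only the standard library. `write_sample` is a hypothetical function, taking the first rows is an assumption, and so is the default delimiter: the original UCI student performance files use `;`, so set `delimiter` to match your files.

```python
import csv
from pathlib import Path


def write_sample(src_path, dst_path, n_rows=5, target="G3", delimiter=","):
    """Copy the first n_rows of src_path to dst_path without the target column.

    Returns the list of column names that were written.
    """
    with Path(src_path).open(newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src, delimiter=delimiter)
        # Keep every column except the target (G3 by default).
        columns = [name for name in reader.fieldnames if name != target]
        rows = [
            {name: row[name] for name in columns}
            for _, row in zip(range(n_rows), reader)
        ]

    dst = Path(dst_path)
    dst.parent.mkdir(parents=True, exist_ok=True)
    with dst.open("w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=columns, delimiter=delimiter)
        writer.writeheader()
        writer.writerows(rows)
    return columns
```

A file produced this way keeps the column order of its source, which is what a model trained on that source would expect at prediction time.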
The script creates:
data/new_data_math.csv → 5 rows from the math dataset (without G3)
data/new_data.csv → 5 rows from the Portuguese dataset (without G3)
Run predictions on new data with a trained model:
# Predict with the math model
python src/predict.py --model results/models/random_forest_math.pkl --data data/new_data_math.csv --out results/predictions
# Predict with the Portuguese model
python src/predict.py --model results/models/random_forest_portuguese.pkl --data data/new_data.csv --out results/predictions
Predictions are written to:
results/predictions/predictions.csv
Example output:
First 10 predictions: [10.82 11.06 13.07 14.54 13.02]
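A saved model can also be called from Python rather than through the CLI. The sketch below is not the project's `predict.py`: `load_model` and `predict_with` are hypothetical helpers, and it assumes the `.pkl` files were written with the standard `pickle` module. If the project saves models with joblib instead, `joblib.load` would be the corresponding call.

```python
import pickle
from pathlib import Path


def load_model(model_path):
    """Unpickle a saved model. Only load pickle files from a source you trust."""
    with Path(model_path).open("rb") as fh:
        return pickle.load(fh)


def predict_with(model, features):
    """Call the model's predict method and return the result as a plain list."""
    return list(model.predict(features))


# Example use (assumes pandas is installed and the CSV layout matches training):
# import pandas as pd
# model = load_model("results/models/random_forest_math.pkl")
# features = pd.read_csv("data/new_data_math.csv")
# print(predict_with(model, features))
```

Anything with a `predict` method works with `predict_with`, so the same helper would apply if a different model type were trained.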