Machine Learning A2
Chinese MNIST Classification Experiments using Logistic Regression & Configurable MLP Baselines
Overview
This project investigates supervised classification on the Chinese MNIST dataset using two baseline model families:
- Logistic Regression (single linear layer) – a capacity baseline.
- Multi-Layer Perceptron (MLP) with configurable depth, width patterns, dropout, and optimizer sweeps.
The goal is to quantify how architectural depth/width, batch size, optimizer choice (SGD+Momentum vs Adam), and learning rate impact convergence behavior and generalization on a mid‑sized grayscale digit dataset with 15 target classes.
Key design principles:
- Reproducible training loops with early stopping (monitoring validation accuracy).
- Systematic hyperparameter sweeps with per‑run metric persistence (JSON + aggregated CSV).
- Strict, explicit checkpoint naming to enable automated parsing during model selection.
- Lightweight, interpretable baselines before exploring convolutional or more modern architectures.
Dataset
Chinese MNIST consists of grayscale images (handwritten Chinese numeral variants). The project pipeline:
Step | Action |
---|---|
1 | Download & extract data.zip (if missing) |
2 | Generate stratified splits: train / val / test (80 / 10 / 10) |
3 | Optional label variant control (value, value_character, or code) |
4 | Apply light train-time augmentation: small rotation + affine translation |
5 | Transform → Grayscale → Resize (64×64) → ToTensor → [Normalize]* → [Flatten]* |
(*Normalization and flattening are configurable. Flattening is enabled for linear / MLP baselines.)
Augmentation is intentionally conservative to avoid distorting digit structure. All transforms are defined in preprocessing_utils.py.
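The stratified 80 / 10 / 10 split in step 2 can be sketched in plain Python as below. This is an illustration only: `stratified_split` is a hypothetical helper, not the function in preprocessing_utils.py, and the label list is synthetic (15 classes with 100 samples each).

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Return (train, val, test) index lists that preserve class proportions.

    Hypothetical helper for illustration; the project's own split logic may differ.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = int(round(fractions[0] * len(indices)))
        n_val = int(round(fractions[1] * len(indices)))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to test
    return train, val, test


# Synthetic labels: 15 classes, 100 samples per class (assumed sizes).
labels = [c for c in range(15) for _ in range(100)]
train_idx, val_idx, test_idx = stratified_split(labels)
```

Splitting per class rather than over the whole dataset keeps every class represented in the same ratio in all three subsets, which matters when accuracy and macro‑F1 are compared across splits.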
Models
Logistic Regression
A single Linear(input_dim, num_classes) layer. It serves as a minimal baseline for testing whether the classes are linearly separable in raw pixel space (after flattening).
MLP Baseline
Hidden layer blocks: Linear → BatchNorm1d → ReLU → Dropout, repeated per hidden layer. The output layer is linear (raw logits). Depth and width patterns explored include:
- mlp_1x512
- mlp_2x256
- mlp_3x512-256-128
- (Easily extensible to 4+ layer tapered variants.)
Regularization: Dropout (0.5) + Weight Decay (5e-4). Adam and SGD with momentum (0.9) are compared across learning rates ranging from coarse (0.01) to fine (0.0005).
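A minimal PyTorch sketch of the block structure described above follows. `build_mlp` is a hypothetical name and need not match the class in mlp.py; the input dimension of 64×64 = 4096 assumes a single-channel image flattened after the resize step.

```python
import torch
from torch import nn


def build_mlp(input_dim, hidden_sizes, num_classes, dropout=0.5):
    """Stack Linear -> BatchNorm1d -> ReLU -> Dropout per hidden layer,
    followed by a linear output layer that emits raw logits.

    Hypothetical builder for illustration only.
    """
    layers = []
    prev = input_dim
    for width in hidden_sizes:
        layers += [
            nn.Linear(prev, width),
            nn.BatchNorm1d(width),
            nn.ReLU(),
            nn.Dropout(dropout),
        ]
        prev = width
    layers.append(nn.Linear(prev, num_classes))  # no softmax: raw logits
    return nn.Sequential(*layers)


# mlp_3x512-256-128 on flattened 64x64 grayscale input, 15 classes.
model = build_mlp(64 * 64, [512, 256, 128], num_classes=15)

# Weight decay as stated in the text; lr is one value from the sweep.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=5e-4)

model.eval()  # disable dropout and use BatchNorm running statistics
with torch.no_grad():
    logits = model(torch.zeros(4, 64 * 64))
```

With an empty `hidden_sizes` list the same builder reduces to a single Linear layer, which is the logistic regression baseline.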
Experiments
Dimension | Values |
---|---|
Batch Size | 32, 256 |
Optimizers | Adam, SGD (momentum=0.9) |
Learning Rates | 0.01, 0.001, 0.0005 |
Epoch Ceiling | 30 (logreg), 50 (MLP) |
Early Stopping | Patience = 5 (val accuracy) |
Architectures | See model list above |
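The early-stopping rule in the table (patience of 5 epochs on validation accuracy) can be sketched as follows. The `EarlyStopping` class and the accuracy values are hypothetical and only illustrate the rule; the project's training loop may implement it differently.

```python
class EarlyStopping:
    """Stop once validation accuracy has not improved for `patience` epochs."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best = float("-inf")
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, epoch, val_acc):
        """Record one epoch; return True if training should stop."""
        if val_acc > self.best:
            # Strict improvement: this is where a checkpoint would be saved.
            self.best = val_acc
            self.best_epoch = epoch
            self.bad_epochs = 0
            return False
        self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation accuracies, for illustration only.
val_history = [0.50, 0.60, 0.65, 0.64, 0.65, 0.63, 0.62, 0.61, 0.60, 0.70]

stopper = EarlyStopping(patience=5)
stopped_at = None
for epoch, acc in enumerate(val_history):
    if stopper.step(epoch, acc):
        stopped_at = epoch
        break
```

In this toy history the best epoch is index 2, and training halts at index 7 after five epochs without a strict improvement, so the late recovery at index 9 is never reached. That is the usual trade-off of a short patience window.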
Each (architecture, batch_size, optimizer, lr) combination produces a best‑validation checkpoint. File naming pattern examples:
checkpoints/log_reg/model_{batch}_{optimizer}_{lr}.pth
checkpoints/mlp/mlp_3x512-256-128_{batch}_{optimizer}_{lr}.pth
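As a sketch, the sweep grid and checkpoint names can be enumerated with `itertools.product`. The optimizer tokens ("adam", "sgd") and the way the learning rate is rendered in the filename are assumptions; only the overall pattern comes from the examples above.

```python
from itertools import product

ARCHS = ["mlp_1x512", "mlp_2x256", "mlp_3x512-256-128"]
BATCH_SIZES = [32, 256]
OPTIMIZERS = ["adam", "sgd"]  # assumed filename tokens
LEARNING_RATES = [0.01, 0.001, 0.0005]


def checkpoint_path(arch, batch, optimizer, lr):
    """Build a path following the {arch}_{batch}_{optimizer}_{lr}.pth pattern.

    Hypothetical helper; the exact lr formatting used by the project is not known.
    """
    return f"checkpoints/mlp/{arch}_{batch}_{optimizer}_{lr}.pth"


grid = list(product(ARCHS, BATCH_SIZES, OPTIMIZERS, LEARNING_RATES))
paths = [checkpoint_path(*config) for config in grid]
```

With three architectures, two batch sizes, two optimizers, and three learning rates, the MLP sweep covers 3 × 2 × 2 × 3 = 36 runs.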
Per‑epoch metrics are persisted for every hyperparameter configuration:
- Per‑run JSON: metrics/mlp/*.json
- Aggregated CSV: metrics/mlp/runs_detailed.csv
This enables offline analytics (ranking, plotting without re‑training) and reproducible result summaries.
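One way to persist metrics in both forms, using only the standard library, is sketched below. `persist_run`, the JSON layout, and the CSV column names are assumptions, and the history values are dummy numbers rather than results from this project.

```python
import csv
import json
import tempfile
from pathlib import Path


def persist_run(metrics_dir, run_name, config, history):
    """Write one JSON file per run and append its epochs to a shared CSV.

    Hypothetical helper; the real schema in models_utils.py may differ.
    """
    metrics_dir = Path(metrics_dir)
    metrics_dir.mkdir(parents=True, exist_ok=True)

    # Per-run JSON: full config plus per-epoch history.
    payload = {"config": config, "history": history}
    (metrics_dir / f"{run_name}.json").write_text(json.dumps(payload, indent=2))

    # Aggregated CSV: one row per (run, epoch); header written only once.
    csv_path = metrics_dir / "runs_detailed.csv"
    fieldnames = ["run", *config, *history[0]]
    write_header = not csv_path.exists()
    with csv_path.open("a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        for row in history:
            writer.writerow({"run": run_name, **config, **row})
    return csv_path


# Demo in a temporary directory with dummy values (not real results).
with tempfile.TemporaryDirectory() as tmp:
    run = "mlp_2x256_32_adam_0.001"
    config = {"arch": "mlp_2x256", "batch_size": 32, "optimizer": "adam", "lr": 0.001}
    history = [{"epoch": 1, "val_acc": 0.5}, {"epoch": 2, "val_acc": 0.6}]
    csv_path = persist_run(Path(tmp) / "metrics" / "mlp", run, config, history)

    with csv_path.open(newline="") as fh:
        rows = list(csv.DictReader(fh))
    saved = json.loads((csv_path.parent / f"{run}.json").read_text())
```

Appending to a single CSV keeps every run in one table that pandas can load directly, while the JSON files keep each run inspectable on its own.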
Evaluation & Selection
During the test phase:
- Checkpoints are enumerated and parsed via regex to recover architecture + hyperparameters.
- Each model is reconstructed and evaluated on the test loader matching its batch size.
- The best model is selected by test accuracy (tie‑breaking could be extended to use macro‑F1).
- Confusion matrix + performance curves for the best run are rendered to plots/.
- A lightweight qualitative inference stage samples one test image per class for visual confirmation and class confidence inspection.
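The regex-based recovery of architecture and hyperparameters from a checkpoint filename might look like the sketch below. The pattern is an assumption built from the naming examples in this README, not the project's actual regex. It also assumes that a name such as mlp_2x256 means two hidden layers of width 256, and that a hyphenated suffix lists each layer's width explicitly.

```python
import re

# Assumed pattern: {arch}_{batch}_{optimizer}_{lr}.pth, where arch is either
# "model" (logistic regression) or "mlp_{depth}x{width[-width...]}".
CKPT_RE = re.compile(
    r"^(?P<arch>model|mlp_\d+x\d+(?:-\d+)*)"
    r"_(?P<batch>\d+)"
    r"_(?P<optimizer>[A-Za-z]+)"
    r"_(?P<lr>\d*\.?\d+(?:[eE][-+]?\d+)?)"
    r"\.pth$"
)


def parse_checkpoint(filename):
    """Return the configuration encoded in a checkpoint filename."""
    match = CKPT_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {filename}")

    arch = match["arch"]
    hidden_sizes = []  # logistic regression has no hidden layers
    if arch != "model":
        depth, widths = arch[len("mlp_"):].split("x")
        widths = [int(w) for w in widths.split("-")]
        # A single width is repeated `depth` times; otherwise widths are explicit.
        hidden_sizes = widths if len(widths) > 1 else widths * int(depth)

    return {
        "arch": arch,
        "hidden_sizes": hidden_sizes,
        "batch_size": int(match["batch"]),
        "optimizer": match["optimizer"],
        "lr": float(match["lr"]),
    }


tapered = parse_checkpoint("mlp_3x512-256-128_32_adam_0.001.pth")
uniform = parse_checkpoint("mlp_2x256_256_sgd_0.01.pth")
logreg = parse_checkpoint("model_32_sgd_0.0005.pth")
```

Anchoring the pattern at both ends makes a malformed filename raise an error instead of being silently misparsed, which is why strict naming matters for automated selection.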
Metric focus: Accuracy (primary), Macro‑F1 (class balance), qualitative inspection (misclassification patterns).
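For reference, macro‑F1 is the unweighted mean of per-class F1 scores, so every class counts equally regardless of its frequency. A dependency-free sketch with toy labels follows (scikit‑learn's `f1_score` with `average="macro"` computes the same quantity). Scoring a class that is absent from both the labels and the predictions as 0 is a choice made here.

```python
def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1, with F1 = 2*TP / (2*TP + FP + FN)."""
    scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy example with 3 classes (not project data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred, num_classes=3)
```

Here the per-class F1 values are 0.5, 0.8, and 2/3, giving a macro‑F1 of 59/90 ≈ 0.656 against an accuracy of 4/6 ≈ 0.667.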
Quick Start
1. Environment
Install Python 3.10+ and required packages (PyTorch, torchvision, torchinfo, seaborn, scikit‑learn, pandas, numpy, matplotlib). If a requirements.txt is added later, prefer that.
2. Run Logistic Regression Baseline
python log_reg.py
3. Run MLP Experiments
Open presentation.ipynb (or a future mlp_*.py script) and execute the cells sequentially, or adapt the training loops for CLI execution.
4. Inspect Outputs
- Checkpoints: checkpoints/
- Metrics JSON/CSV: metrics/
- Plots (curves & confusion matrices): plots/
5. Inference (Best Model)
Run the later notebook cells (section “Inference”) or re‑load a checkpoint and call the predict_image utility.
Repository Layout
├── log_reg.py # Logistic regression experiment script
├── mlp.py # Configurable MLP model definition
├── models_utils.py # Training / evaluation / plotting utilities
├── preprocessing_utils.py # Dataset prep, transforms, loaders, visualization
├── presentation.ipynb # Exploratory + training notebook (MLP focus)
├── checkpoints/ # Saved model weights (per config)
├── metrics/ # JSON + CSV run metrics
├── plots/ # Loss/accuracy curves & confusion matrices
├── data/ # Prepared splits + raw CSV
└── LICENSE # GPLv3 license
Possible Extensions
- Convolutional baselines (CNN) with non‑flattened transforms.
- Calibration analysis (reliability diagrams, ECE).
- Automated hyperparameter search (Optuna / Ray Tune).
- Model ensembling across top runs.
- Export for mobile / ONNX runtime inference.
License
This project is distributed under the terms of the GNU General Public License v3.0 (GPL‑3.0). See the LICENSE file for full details.