Watt-The-Hack · Model Architecture

MODEL ARCHITECTURE

How the prediction engine was built

🧠

How does the model work?

Instead of relying on a single algorithm, this solution uses a stacking ensemble: three gradient boosting models (LightGBM, CatBoost, XGBoost) each make their own prediction independently, and a fourth model (RidgeCV) learns how best to combine those three predictions into one final answer. This "wisdom of crowds" approach often outperforms any single model on its own, because the base models make partly different errors. Training used 5-fold cross-validation to guard against overfitting: the data was split into 5 folds, and each fold was held out once for validation while the models trained on the other four.
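To make the 5-fold idea concrete, here is a minimal sketch in plain Python of how row indices can be divided so that every row is held out exactly once. The function name `k_fold_indices` and the 12-row example are illustrative assumptions, not the project's code, and no shuffling is applied.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs; each fold is held out exactly once."""
    # Fold sizes differ by at most one when n_samples is not divisible by k.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


# Hypothetical 12-row dataset, split 5 ways.
folds = list(k_fold_indices(12, k=5))
for i, (train_idx, val_idx) in enumerate(folds):
    print(f"fold {i}: train on {len(train_idx)} rows, validate on rows {val_idx}")
```

Each row appears in exactly one validation fold, so every prediction used for validation comes from a model that did not train on that row.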

⚙️

Three models, each making independent predictions

Each bar shows how much that model contributes to the final answer. All three are gradient boosting algorithms: they build many small decision trees one after another, with each new tree fitted to correct the errors left by the trees before it.

Base Models

LGB

LightGBM

n_estimators=400, lr=0.03, depth=7, subsample=0.9

88%

CAT

CatBoost

iterations=400, lr=0.03, depth=7, verbose=0

85%

XGB

XGBoost

n_estimators=400, lr=0.03, depth=7, subsample=0.9

86%

RCV

RidgeCV META-LEARNER

alphas=logspace(-2,2,10), cv=5 — combines base predictions

99%
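The `alphas` setting is a grid of candidate regularisation strengths for the meta-learner. Assuming `logspace` here refers to NumPy's base-10 `logspace`, the sketch below reproduces the same ten values in plain Python. The helper function is illustrative only.

```python
def logspace(start_exp, stop_exp, num):
    """Return num values evenly spaced on a log10 scale, endpoints included."""
    step = (stop_exp - start_exp) / (num - 1)
    return [10 ** (start_exp + i * step) for i in range(num)]


# Ten candidate strengths from 10^-2 to 10^2, as in alphas=logspace(-2,2,10).
alphas = logspace(-2, 2, 10)
print([round(a, 4) for a in alphas])
```

RidgeCV tries each candidate and keeps the one that cross-validates best when blending the three base predictions.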

🔧

How raw inputs were transformed before training

Raw data rarely goes straight into a model. These 5 steps cleaned, enriched, and normalised the data to help the model learn better patterns.

Feature Engineering Pipeline

Raw Inputs

5 component fractions + 5×10 component properties = 55 features

Weighted Averages

WA_Property_i = Σ_c (fraction_c × property_c,i), summed over the 5 components c · 10 engineered features
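The weighted-average formula can be sketched in a few lines of plain Python. The function name and the tiny two-component, two-property example are assumptions chosen for readability; the real data has 5 components and 10 properties.

```python
def weighted_average_properties(fractions, properties):
    """Blend-level value of each property.

    fractions:  one fraction per component.
    properties: one row per component, one column per property.
    """
    n_props = len(properties[0])
    return [
        sum(f * row[i] for f, row in zip(fractions, properties))
        for i in range(n_props)
    ]


# Hypothetical blend: 25% of component A, 75% of component B.
fractions = [0.25, 0.75]
properties = [
    [1.0, 2.0],  # component A: property 1, property 2
    [3.0, 4.0],  # component B: property 1, property 2
]
wa = weighted_average_properties(fractions, properties)
print(wa)  # [2.5, 3.5]
```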

Outlier Removal

IQR-based (values more than 1.5× the IQR beyond the quartiles), applied per target to clean the training distribution
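As an illustration of the 1.5× IQR rule, the sketch below filters one target column with the standard library. The function names and sample values are hypothetical, and how rows flagged under different targets were combined is not stated in the source, so only a single target is shown.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) limits: k * IQR below Q1 and above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(rows, target, k=1.5):
    """Keep only the (row, target) pairs whose target lies inside the limits."""
    low, high = iqr_bounds(target, k)
    return [(r, t) for r, t in zip(rows, target) if low <= t <= high]


# Hypothetical target column with one extreme value.
target = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]
rows = ["a", "b", "c", "d", "e", "f"]
bounds = iqr_bounds(target)
kept = remove_outliers(rows, target)
print(bounds)                # (-1.5, 8.5)
print([r for r, _ in kept])  # ['a', 'b', 'c', 'd', 'e']
```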

Quantile Transform

Every feature mapped to a normal output distribution using 100 quantiles
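The idea behind a quantile transform to a normal distribution can be shown with a simplified rank-based version. This is a stand-in, not the project's transformer: it maps each value's rank straight to a normal quantile instead of interpolating between 100 quantile landmarks, and tied values would receive different outputs. All names are illustrative.

```python
from statistics import NormalDist


def rank_to_normal(values):
    """Map values to standard-normal scores according to their rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    standard_normal = NormalDist()
    out = [0.0] * n
    for rank, i in enumerate(order):
        # Mid-rank position keeps the probability strictly inside (0, 1).
        p = (rank + 0.5) / n
        out[i] = standard_normal.inv_cdf(p)
    return out


# Hypothetical, heavily skewed feature.
values = [10.0, 1.0, 1000.0, 100.0, 5.0]
z = rank_to_normal(values)
print([round(v, 3) for v in z])
```

The ordering of the values is preserved, while the long tail is pulled in so the transformed feature is roughly bell-shaped.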

5-Fold Stacking

Out-of-fold predictions feed meta-learner to prevent data leakage
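Out-of-fold prediction means that the value recorded for each row comes from a model that never saw that row during training. The sketch below shows the mechanics with a deliberately trivial base learner that predicts the mean of its training targets. The function names, the toy learner, and the round-robin fold assignment are all assumptions made for illustration.

```python
def out_of_fold_predictions(X, y, fit, predict, k=5):
    """Return one prediction per row, each made by a model not trained on it."""
    n = len(X)
    oof = [None] * n
    for fold in range(k):
        # Round-robin assignment: row i belongs to fold i % k.
        val_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in val_idx:
            oof[i] = predict(model, X[i])
    return oof


# Toy base learner (hypothetical): always predicts the training-set mean.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x):
    return model


X = [[float(i)] for i in range(10)]
y = [float(i) for i in range(10)]
oof = out_of_fold_predictions(X, y, fit_mean, predict_mean, k=5)
print(oof)
```

In a stacking setup, one such column per base model forms the training matrix for the meta-learner, so the meta-learner only ever sees predictions made on unseen rows.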

🏆

Final scores after combining all three models

These are the R² scores achieved by the full stacking ensemble on the test set, one score per blend property. An R² of 1.0 means perfect prediction, so the closer to 1.0, the better. 9 out of 10 properties score above 0.99, which is exceptional for a real-world chemistry prediction task.
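For reference, R² compares the model's squared error with that of always predicting the mean. The short sketch below computes it in plain Python on made-up numbers; the function name and data are illustrative and are not the project's results.

```python
def r_squared(y_true, y_pred):
    """R² = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical values for one blend property.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.9]
score = r_squared(y_true, y_pred)
print(round(score, 3))  # 0.986
```

A model that always predicts the mean scores 0.0, and a perfect model scores 1.0.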

Final R² Scores — Stacked Ensemble