Financial AI Model Study
Compared regression, dimensionality-reduction, tree, support-vector, and neural-network approaches to understand when model complexity improves generalization.
Period · Fall 2024
Role · Individual project series
01 · Business / research question
The question
How do preprocessing, regularization, model complexity, and hyperparameters affect validation and test performance?
02 · Evidence
What the analysis used
03 · Analysis
How the work progressed
- 01
Test model complexity
Compared polynomial degrees and decision-tree settings using held-out performance.
- 02
Examine representation and regularization
Applied PCA, standardization, Ridge, and Lasso to understand structure and overfitting.
- 03
Tune support-vector models
Evaluated Linear SVR parameter combinations on Iowa housing data.
- 04
Compare neural-network depth
Compared linear regression, a one-hidden-layer ANN, and a three-hidden-layer DNN.
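As a rough illustration of step 02 above, the sketch below standardizes features and fits closed-form Ridge regression at several penalty strengths. The data are synthetic and the `standardize` and `ridge_fit` helpers and the `alphas` values are assumptions made for illustration; none of them come from the assignments.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    return (X - mean) / std

def ridge_fit(X, y, alpha):
    """Closed-form ridge coefficients for standardized X and centered y."""
    y_centered = y - y.mean()
    n_features = X.shape[1]
    A = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y_centered)

# Synthetic data with deliberately mismatched feature scales (assumption).
rng = np.random.default_rng(0)
X_raw = rng.normal(size=(200, 5)) * np.array([1.0, 10.0, 100.0, 0.1, 5.0])
y = 2.0 * X_raw[:, 0] + 0.3 * X_raw[:, 1] + rng.normal(size=200)

X = standardize(X_raw)
alphas = [0.0, 1.0, 10.0, 100.0, 1000.0]  # hypothetical penalty grid
coef_norms = [float(np.linalg.norm(ridge_fit(X, y, a))) for a in alphas]

for a, norm in zip(alphas, coef_norms):
    print(f"alpha={a:>7.1f}  ||coef||={norm:.4f}")
```

The coefficient norm shrinks as the penalty grows, which is the mechanism Ridge uses to limit overfitting. Lasso applies an L1 penalty instead and has no closed form, so it is not reproduced here.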
04 · Interpretation
Main insight
The first two reported PCA components explained approximately 88.4% of variance in the national-risk example.
Linear SVR with C=1 and epsilon=50 produced the lowest reported validation MSE among the tested combinations.
Additional DNN depth produced little improvement over the smaller ANN.
05 · Practical decision
Decision value
Use additional complexity only when validation evidence justifies it; a more advanced method is not automatically the stronger business choice.
06 · Validation
Limitations and next checks
- Original datasets and data dictionaries should be recovered.
- The four assignments should be consolidated into reproducible notebooks.
- Classroom metrics should not be presented as production performance.
07 · Visual evidence
Evidence, with boundaries
Polynomial degree and validation RMSE
Reported evidence
The bars reproduce validation RMSE values reported in assignment 1. They do not by themselves establish a preferred final degree; training and test behavior still need a reproducible notebook.
Source · Financial AI assignment 1 report
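A minimal sketch of how a degree-versus-validation-RMSE comparison of this kind could be reproduced is shown here. It uses synthetic data and `numpy.polyfit`; the data-generating function, the 80/40 split, and the degree range are all assumptions, and the numbers it prints are unrelated to the values reported in assignment 1.

```python
import numpy as np

# Synthetic one-dimensional regression problem (assumption, not the course data).
rng = np.random.default_rng(42)
x = rng.uniform(-1.0, 1.0, size=120)
y = np.sin(3.0 * x) + rng.normal(scale=0.2, size=x.size)

# Hold out the last 40 points for validation.
x_train, x_val = x[:80], x[80:]
y_train, y_val = y[:80], y[80:]

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

results = {}
for degree in range(1, 9):  # hypothetical degree range
    coefs = np.polyfit(x_train, y_train, deg=degree)
    results[degree] = {
        "train_rmse": rmse(y_train, np.polyval(coefs, x_train)),
        "val_rmse": rmse(y_val, np.polyval(coefs, x_val)),
    }

# The degree with the lowest validation RMSE is a candidate, not a final choice.
best_degree = min(results, key=lambda d: results[d]["val_rmse"])

for degree, scores in results.items():
    print(degree, round(scores["train_rmse"], 4), round(scores["val_rmse"], 4))
print("lowest validation RMSE at degree", best_degree)
```

Training RMSE can only fall as the degree rises, so the training column alone says nothing about generalization; the validation column is the one that carries the comparison.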
PCA explained variance
Reported evidence
PC1 and PC2 explain 64.03% and 24.37% of variance, respectively, or about 88.4% combined, in the reported national-risk example. This is classroom evidence for that dataset, not general financial-model performance.
Source · Financial AI assignment 1 report · national-risk example
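The combined figure is simple addition of the two reported ratios, and the ratios themselves can be computed from the singular values of a standardized data matrix. The sketch below checks the arithmetic and then shows that computation on synthetic data; the `explained_variance_ratio` helper and the random matrix are assumptions, since the original national-risk dataset is not available here.

```python
import numpy as np

# Reported ratios for PC1 and PC2, in percent (the only values given).
reported = [64.03, 24.37]
combined = sum(reported)
print(f"PC1 + PC2 = {combined:.2f}%")

def explained_variance_ratio(X):
    """Share of total variance captured by each principal component."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize before PCA
    singular_values = np.linalg.svd(Xs, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

# Synthetic correlated data (assumption): 4 features driven by 2 latent factors.
rng = np.random.default_rng(7)
latent = rng.normal(size=(150, 2))
mixing = rng.normal(size=(2, 4))
X = latent @ mixing + 0.3 * rng.normal(size=(150, 4))

ratios = explained_variance_ratio(X)
cumulative = np.cumsum(ratios)
print("ratios:", np.round(ratios, 4))
print("cumulative:", np.round(cumulative, 4))
```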
Linear SVR tuning result
Reported evidence
C · 1.0
epsilon · 50
Train MSE · 1,782.2306
Validation MSE · 1,944.5111
Among the tested combinations, C=1 and epsilon=50 produced the lowest reported validation MSE. This is a bounded classroom experiment, not evidence of external or production performance.
Source · Financial AI assignment 3 report · Iowa housing example
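One way a parameter sweep like this might be set up with scikit-learn's `LinearSVR` is sketched below. The data are synthetic stand-ins for the Iowa housing example, and the grid values, the scaling step, the split, and `max_iter` are assumptions rather than the settings used in assignment 3.

```python
import math
from itertools import product

import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVR

# Synthetic regression data (assumption, not the Iowa housing data).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
true_coefs = np.array([40.0, 25.0, -15.0, 10.0, 0.0, 5.0])
y = 180.0 + X @ true_coefs + rng.normal(scale=20.0, size=300)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the scaler on training data only to avoid leaking validation statistics.
scaler = StandardScaler().fit(X_train)
X_train_s, X_val_s = scaler.transform(X_train), scaler.transform(X_val)

grid = {"C": [0.1, 1.0, 10.0], "epsilon": [0.0, 10.0, 50.0]}  # hypothetical grid

results = []
for C, epsilon in product(grid["C"], grid["epsilon"]):
    model = LinearSVR(C=C, epsilon=epsilon, max_iter=10000, random_state=0)
    model.fit(X_train_s, y_train)
    results.append({
        "C": C,
        "epsilon": epsilon,
        "train_mse": float(mean_squared_error(y_train, model.predict(X_train_s))),
        "val_mse": float(mean_squared_error(y_val, model.predict(X_val_s))),
    })

best = min(results, key=lambda r: r["val_mse"])
print("lowest validation MSE:", best)

# RMSE implied by the reported validation MSE, in the target's own units.
reported_val_rmse = math.sqrt(1944.5111)
print("reported validation RMSE:", round(reported_val_rmse, 2))
```

Selecting on validation MSE only identifies the best setting inside the tested grid; a separate test split would still be needed to estimate how that setting generalizes.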
ANN and DNN validation behavior
Reported evidence
ANN · One hidden layer · comparison baseline
DNN · Three hidden layers · limited gain from added complexity
The report concludes that the deeper DNN did not deliver a meaningful improvement over the smaller ANN. Because minimum-loss figures are not fully consistent within the document, only the qualitative conclusion is shown.
Source · Financial AI assignment 4 report
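A comparison of this shape could be set up roughly as follows, using scikit-learn's `LinearRegression` and `MLPRegressor`. The synthetic data, the layer width of 16, the iteration limit, and the model names are all assumptions; the framework and architecture details used in assignment 4 are not stated in the source.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic data with a mild interaction term (assumption, not the course data).
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 4))
y = X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] * X[:, 3] + rng.normal(scale=0.3, size=400)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=1
)
scaler = StandardScaler().fit(X_train)
X_train_s, X_val_s = scaler.transform(X_train), scaler.transform(X_val)

# Hypothetical architectures: one hidden layer versus three hidden layers.
models = {
    "linear": LinearRegression(),
    "ann_1_hidden": MLPRegressor(hidden_layer_sizes=(16,), max_iter=500, random_state=1),
    "dnn_3_hidden": MLPRegressor(hidden_layer_sizes=(16, 16, 16), max_iter=500, random_state=1),
}

val_mse = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    val_mse[name] = float(mean_squared_error(y_val, model.predict(X_val_s)))

for name, mse in val_mse.items():
    print(f"{name:>13}: validation MSE = {mse:.4f}")
```

Holding the split, scaling, and random seed fixed across the three models keeps the comparison about depth rather than about incidental differences in setup.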