Case 05 · Applied Analytics · Model Selection

Financial AI Model Study

Compared regression, dimensionality-reduction, tree, support-vector, and neural-network approaches to understand when model complexity improves generalization.

Period · Fall 2024

Role · Individual project series

01 · Business / research question

The question

How do preprocessing, regularization, model complexity, and hyperparameters affect validation and test performance?

02 · Evidence

What the analysis used

Four applied assignments using financial and public datasets

Training, validation, and test comparisons

RMSE, MSE, ROC-AUC, scree plots, tuning surfaces, and loss curves

Python · scikit-learn · TensorFlow/Keras · PCA · SVM/SVR · ANN/DNN

03 · Analysis

How the work progressed

01 · Test model complexity
Compared polynomial degrees and decision-tree settings using held-out performance.

02 · Examine representation and regularization
Applied PCA, standardization, Ridge, and Lasso to understand structure and overfitting.

03 · Tune support-vector models
Evaluated Linear SVR parameter combinations on Iowa housing data.

04 · Compare neural-network depth
Compared linear regression, a one-hidden-layer ANN, and a three-hidden-layer DNN.
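As an illustration of the regularization step, the sketch below compares Ridge and Lasso on a validation split, with standardization fitted inside a pipeline. It is a minimal sketch, not the assignment code: the data are synthetic, and the penalty strengths and split sizes are assumptions chosen only for the example.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): the original assignment datasets are not available.
X, y = make_regression(n_samples=300, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)


def validation_rmse(model):
    """Fit on the training split and return RMSE on the validation split."""
    # The scaler sits inside the pipeline so it is fitted on training rows only.
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    return mean_squared_error(y_val, pipeline.predict(X_val)) ** 0.5


# Illustrative penalty strengths (assumption), not the values used in the assignments.
results = {
    "ridge": validation_rmse(Ridge(alpha=1.0)),
    "lasso": validation_rmse(Lasso(alpha=1.0)),
}
for name, rmse in results.items():
    print(f"{name}: validation RMSE = {rmse:.2f}")
```

Keeping the scaler inside the pipeline avoids leaking validation statistics into preprocessing, which matters when the comparison is about generalization.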

04 · Interpretation

Main insight

The first two reported PCA components explained approximately 88.4% of variance in the national-risk example.

Linear SVR with C=1 and epsilon=50 produced the lowest reported validation MSE among the tested combinations.

Additional DNN depth produced little improvement over the smaller ANN.

05 · Practical decision

Decision value

Use additional complexity only when validation evidence justifies it; a more advanced method is not automatically the stronger business choice.
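One way to turn this rule into something checkable is to pick the simplest candidate whose validation error is close enough to the best one. The sketch below assumes a relative tolerance of 2% and uses made-up error values; the function name, the complexity ranks, and the numbers are all hypothetical and are not results from the study.

```python
def simplest_adequate_model(candidates, tolerance=0.02):
    """Return the least complex candidate whose validation error is within
    `tolerance` (relative) of the best validation error.

    `candidates` is a list of (name, complexity_rank, validation_error) tuples.
    """
    best_error = min(error for _, _, error in candidates)
    threshold = best_error * (1.0 + tolerance)
    adequate = [c for c in candidates if c[2] <= threshold]
    # Among adequate candidates, prefer the lowest complexity rank.
    return min(adequate, key=lambda c: c[1])


# Hypothetical validation errors, for illustration only (not results from the study).
candidates = [
    ("linear regression", 1, 0.250),
    ("one-hidden-layer ANN", 2, 0.201),
    ("three-hidden-layer DNN", 3, 0.200),
]
chosen = simplest_adequate_model(candidates)
print(chosen[0])  # one-hidden-layer ANN
```

The tolerance is a judgment call: a tighter value favors the best-scoring model, a looser one favors simplicity.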

06 · Validation

Limitations and next checks

• Original datasets and data dictionaries should be recovered.
• The four assignments should be consolidated into reproducible notebooks.
• Classroom metrics should not be presented as production performance.

07 · Visual evidence

Evidence, with boundaries

Polynomial degree and validation RMSE

Reported evidence

Degree 1 · 44,167

Degree 2 · 29,607

Degree 3 · 27,122

Degree 4 · 26,114

Degree 5 · 24,594

The bars reproduce validation RMSE values reported in assignment 1. They do not by themselves establish a preferred final degree; training and test behavior still need a reproducible notebook.
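To see how much each added degree changes validation RMSE, the reported values can be compared step by step. The sketch below uses only the five figures reported above; the helper name is an assumption, and the calculation says nothing about training or test behavior.

```python
# Validation RMSE by polynomial degree, as reported in assignment 1.
validation_rmse = {1: 44167, 2: 29607, 3: 27122, 4: 26114, 5: 24594}


def relative_reductions(rmse_by_degree):
    """Percentage drop in validation RMSE from each degree to the next."""
    degrees = sorted(rmse_by_degree)
    reductions = {}
    for lower, higher in zip(degrees, degrees[1:]):
        before, after = rmse_by_degree[lower], rmse_by_degree[higher]
        reductions[(lower, higher)] = 100.0 * (before - after) / before
    return reductions


reductions = relative_reductions(validation_rmse)
for (lower, higher), pct in reductions.items():
    print(f"degree {lower} -> {higher}: {pct:.1f}% lower validation RMSE")
```

The drop from degree 1 to degree 2 is about 33%, while each later step reduces validation RMSE by less than 10%.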

Source · Financial AI assignment 1 report

PCA explained variance

Reported evidence

PC1 · 64.03%

PC2 · 24.37%

PC3 · 9.77%

PC4 · 1.83%

PC1 and PC2 explain 64.03% and 24.37%, about 88.4% combined in the reported national-risk example. This is classroom evidence for that dataset, not general financial-model performance.
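The combined figure is a running sum of the reported per-component percentages. A small sketch of that arithmetic, using only the four reported values:

```python
from itertools import accumulate

# Explained-variance percentages as reported for the national-risk example.
explained_pct = {"PC1": 64.03, "PC2": 24.37, "PC3": 9.77, "PC4": 1.83}

# Running total of variance explained as components are added in order.
cumulative_pct = dict(zip(explained_pct, accumulate(explained_pct.values())))
for component, total in cumulative_pct.items():
    print(f"through {component}: {total:.2f}%")
```

The four reported components sum to 100.00%, so the first two account for 88.40% and the first three for 98.17% in this example.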

Source · Financial AI assignment 1 report · national-risk example

Linear SVR tuning result

Reported evidence

C · 1.0

epsilon · 50

Train MSE · 1,782.2306

Validation MSE · 1,944.5111

Among the tested combinations, C=1 and epsilon=50 produced the lowest reported validation MSE. This is a bounded classroom experiment, not evidence of external or production performance.
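A tuning loop of this kind might look like the sketch below, which fits scikit-learn's LinearSVR over a small grid of C and epsilon values and selects on validation MSE. It is a sketch under stated assumptions: the data are synthetic rather than the Iowa housing set, and the grid values, split size, and scaling step are illustrative, not the assignment's actual setup.

```python
from itertools import product

from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVR

# Synthetic stand-in for the Iowa housing data (assumption).
X, y = make_regression(n_samples=400, n_features=10, noise=20.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

# Illustrative grid (assumption); the report's full grid is not reproduced here.
c_values = [0.1, 1.0, 10.0]
epsilon_values = [0.0, 10.0, 50.0]

rows = []
for c, epsilon in product(c_values, epsilon_values):
    model = make_pipeline(
        StandardScaler(),
        LinearSVR(C=c, epsilon=epsilon, max_iter=10_000, random_state=0),
    )
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    val_mse = mean_squared_error(y_val, model.predict(X_val))
    rows.append((c, epsilon, train_mse, val_mse))

# Select on validation MSE only; any test split stays untouched until the end.
best = min(rows, key=lambda row: row[3])
print(f"best C={best[0]}, epsilon={best[1]}, validation MSE={best[3]:.2f}")
```

Recording train and validation MSE side by side for every combination makes it possible to report both figures for the selected setting, as in the result above.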

Source · Financial AI assignment 3 report · Iowa housing example

ANN and DNN validation behavior

Reported evidence

ANN

One hidden layer · comparison baseline

DNN

Three hidden layers · limited gain from added complexity

The report concludes that the deeper DNN did not deliver a meaningful improvement over the smaller ANN. Because minimum-loss figures are not fully consistent within the document, only the qualitative conclusion is shown.
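One way to make "added complexity" concrete is to count trainable parameters in each architecture. The sketch below does this for fully connected networks; the input size and hidden-layer widths are hypothetical, since the report's actual layer sizes are not given here.

```python
def dense_parameter_count(layer_sizes):
    """Count weights and biases in a fully connected network.

    `layer_sizes` lists the width of every layer, input first and output last.
    """
    return sum(
        inputs * outputs + outputs  # weight matrix plus one bias per output unit
        for inputs, outputs in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical widths (assumption): 10 input features, 32 units per hidden layer.
ann_params = dense_parameter_count([10, 32, 1])          # one hidden layer
dnn_params = dense_parameter_count([10, 32, 32, 32, 1])  # three hidden layers
print(ann_params, dnn_params)  # 385 2497
```

Under these assumed widths the three-hidden-layer network has roughly six times as many parameters, which is the extra capacity that validation results would need to justify.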

Source · Financial AI assignment 4 report
