Task8
(a) Compare and contrast stepwise selection with shrinkage methods.
Similarities
- Both help reduce overfitting to the data, especially when the number of observations is small compared to the number of predictors.
- Both can be used for variable selection to reduce model complexity.
Differences
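- Stepwise selection adds or drops one variable at a time and stops when no further step improves the selection criterion (e.g. AIC), so each variable is either fully in or fully out of the model.
- Shrinkage methods can reduce the size of coefficients without entirely eliminating variables (ridge never sets a coefficient exactly to zero, while lasso can).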
(b) Explain why variables are standardized as part of the lasso model fitting procedure.
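The lasso penalty is applied to the size of the coefficients, and the size of a coefficient depends on the scale of its variable: variables on a larger scale typically have smaller coefficients, and vice versa. Without standardizing, the regularization would shrink variables on a smaller scale (which have larger coefficients) more heavily than those on a larger scale, regardless of how predictive they are.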
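To make the scale effect concrete, here is a minimal sketch using only the Python standard library. The data and the helper names `ols_slope` and `standardize` are made up for illustration; the same predictor is recorded in metres and in kilometres, and the unpenalized slope is compared before and after standardizing.

```python
from statistics import mean, pstdev

def ols_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def standardize(x):
    """Center to mean 0 and scale to standard deviation 1."""
    m, s = mean(x), pstdev(x)
    return [(a - m) / s for a in x]

# Hypothetical data: one quantity recorded in two different units.
metres = [1200.0, 2500.0, 3100.0, 4700.0, 5200.0]
kilometres = [m / 1000 for m in metres]
y = [3.1, 5.4, 6.8, 9.9, 11.2]

slope_m = ols_slope(metres, y)       # large-scale variable, small coefficient
slope_km = ols_slope(kilometres, y)  # same information, coefficient 1000x larger

# After standardizing, both versions yield the same coefficient,
# so a penalty on |coefficient| would treat them identically.
slope_m_std = ols_slope(standardize(metres), y)
slope_km_std = ols_slope(standardize(kilometres), y)

print(slope_m, slope_km, slope_m_std, slope_km_std)
```

Without standardizing, the kilometres version would be penalized far more than the metres version even though they carry exactly the same information.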
(c) Describe the process of searching for the optimal value of the hyperparameter lambda in a lasso regression.
The optimal value for lambda can be found using cross-validation. First, a grid of lambda values is chosen for the search. Then for each lambda value, a cross-validation error is calculated.
The first step in calculating a cross-validation error is to partition the data into k folds. A single fold is held out for testing, and the remaining folds are used to train a lasso model with the current lambda value. This process is repeated for each of the k folds, and the cross-validation error is calculated as the average of an error measure (e.g. RMSE; a metric such as AUC can be used instead, in which case higher is better) across all k held-out folds.
The optimal lambda value is the one with the lowest cross-validation error.
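The search described above can be sketched in plain Python. This is an illustration under stated assumptions, not how any particular package implements it: the data are simulated, the predictors are assumed to be on a common scale already, the lasso is fitted with a small hand-written coordinate descent (`fit_lasso`), and the grid values are arbitrary.

```python
import random

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; exactly zero if |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def fit_lasso(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*RSS + lam * sum(|beta_j|)."""
    n, p = len(X), len(X[0])
    b0, beta = sum(y) / n, [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = zj = 0.0
            for row, yi in zip(X, y):
                partial = b0 + sum(row[k] * beta[k] for k in range(p) if k != j)
                rho += row[j] * (yi - partial)
                zj += row[j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (zj / n)
        # Refit the unpenalized intercept given the current coefficients.
        b0 = sum(yi - sum(x * b for x, b in zip(row, beta))
                 for row, yi in zip(X, y)) / n
    return b0, beta

def rmse(X, y, b0, beta):
    sq = [(yi - (b0 + sum(x * b for x, b in zip(row, beta)))) ** 2
          for row, yi in zip(X, y)]
    return (sum(sq) / len(sq)) ** 0.5

def cv_error(X, y, lam, k=5):
    """Average held-out RMSE over k folds for one lambda value."""
    n = len(X)
    folds = [list(range(i, n, k)) for i in range(k)]  # rows are already random
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        b0, beta = fit_lasso([X[i] for i in train], [y[i] for i in train], lam)
        scores.append(rmse([X[i] for i in held_out],
                           [y[i] for i in held_out], b0, beta))
    return sum(scores) / k

# Simulated data (hypothetical): the second predictor is pure noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(60)]
y = [2.0 * r[0] - 1.0 * r[2] + rng.gauss(0, 0.5) for r in X]

grid = [0.001, 0.01, 0.1, 0.5, 1.0, 5.0]           # arbitrary lambda grid
cv_errors = {lam: cv_error(X, y, lam) for lam in grid}
best_lambda = min(cv_errors, key=cv_errors.get)     # lowest CV error wins
print(cv_errors, best_lambda)
```

The largest lambda in this grid shrinks every coefficient to zero, so its cross-validation error is that of an intercept-only model.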
https://www.youtube.com/watch?v=fSytzGwwBVw
StatQuest also has a video on cross-validation, but for this topic the ISLR book seems easier to understand.
(f) confusion matrix
| pred\ref | negative | positive |
|---|---|---|
| negative | TN | FN |
| positive | FP | TP |
sensitivity = TP/(TP+FN)
specificity = TN/(TN+FP)
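As a quick worked example of the two formulas, with made-up counts:

```python
def sensitivity(tp, fn):
    """Share of actual positives that are predicted positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Share of actual negatives that are predicted negative."""
    return tn / (tn + fp)

# Hypothetical confusion-matrix counts, for illustration only.
tp, fn, tn, fp = 40, 10, 85, 15

sens = sensitivity(tp, fn)  # 40 / 50
spec = specificity(tn, fp)  # 85 / 100
print(sens, spec)
```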
https://www.youtube.com/watch?v=vP06aMoz4v8
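This did not click when I read it as text, but after watching the video I could see when a model with high sensitivity or high specificity should be preferred. It also covers the cases with 3 and 4 classes.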
(g) lowering the cutoff threshold?
Assess the consequences of this recommendation as it relates to the business problem.
This will increase positive predictions (both TP and FP) while reducing negative predictions (both TN and FN), which increases sensitivity and decreases specificity.
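The effect can be checked on a small made-up set of predicted probabilities. The cutoffs 0.5 and 0.3 and the helper `confusion_counts` are assumptions for illustration only.

```python
def confusion_counts(probs, labels, cutoff):
    """Return (TP, FP, TN, FN) when predicting positive if prob >= cutoff."""
    tp = fp = tn = fn = 0
    for p, actual in zip(probs, labels):
        predicted = 1 if p >= cutoff else 0
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

# Hypothetical predicted probabilities and true labels.
probs = [0.05, 0.20, 0.35, 0.45, 0.55, 0.60, 0.75, 0.90]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

for cutoff in (0.5, 0.3):
    tp, fp, tn, fn = confusion_counts(probs, labels, cutoff)
    print(cutoff, "sensitivity", tp / (tp + fn), "specificity", tn / (tn + fp))
```

With these numbers, lowering the cutoff from 0.5 to 0.3 raises sensitivity from 0.75 to 1.0 and lowers specificity from 0.75 to 0.5.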
Task9
(a) Describe how bagging is used in the random forest algorithm and the advantage it gives random forests over a single decision tree in terms of the bias/variance trade-off.
Random forests are created by applying bagging and taking random feature subsets to construct multiple trees, which are averaged to produce a prediction.
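Bagging (bootstrap aggregation) is the process of training multiple models in parallel on different bootstrap samples of the data, i.e. random samples drawn with replacement. Each individual tree is therefore trained on a different training dataset. Variance refers to the sensitivity of the model to changes in the training dataset. Bagging reduces variance because the predictions of many trees, each trained on different data, are averaged, so the quirks of any single tree largely cancel out, while bias stays roughly the same as for a single tree.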
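A rough sketch of the bagging step alone, using only the standard library. To keep it short it makes simplifying assumptions that differ from a real random forest: the base learner is a depth-1 regression tree (a stump) on a single feature, there is no random feature subsetting, and the data are simulated.

```python
import random

def fit_stump(xs, ys):
    """Depth-1 regression tree: returns (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs))[1:]:  # each candidate leaves both sides non-empty
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical: predict the mean everywhere
        m = sum(ys) / len(ys)
        return (xs[0], m, m)
    return best[1:]

def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x < t else rm

def bagged_stumps(xs, ys, n_models, rng):
    """Fit one stump per bootstrap sample (drawn with replacement)."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_bagged(models, x):
    """Average the predictions of all fitted models."""
    preds = [predict_stump(m, x) for m in models]
    return sum(preds) / len(preds)

# Simulated data (hypothetical): a noisy step at x = 1.5.
rng = random.Random(0)
xs = [i / 10 for i in range(30)]
ys = [(1.0 if x > 1.5 else 0.0) + rng.gauss(0, 0.3) for x in xs]

models = bagged_stumps(xs, ys, n_models=25, rng=rng)
individual = [predict_stump(m, 1.4) for m in models]
print(min(individual), max(individual), predict_bagged(models, 1.4))
```

Near the step, individual stumps can disagree depending on which rows their bootstrap sample happened to contain; the bagged prediction is their average and so moves less from one training set to the next.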