Week 9, Lecture 2#

Decision Tree#

Random Forest#

Regression Tree#

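When to use a regression tree (RT): when the data's distribution is weird, for example clearly non-linear, so that a straight line fits it poorly.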


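How to build an RT: each node predicts the mean target value of the samples that fall in it, and candidate splits are evaluated with the residual sum of squares (RSS), i.e. we choose the splitting threshold for which the total RSS of the two child nodes is at its minimum.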
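The split search described above can be sketched in a few lines. This is a minimal illustration for a single feature, not a full tree implementation; the helper names `rss` and `best_split` and the toy data are made up for this example.

```python
def rss(values):
    """Residual sum of squares of the values around their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_rss) for the single-feature split with minimum RSS."""
    pairs = sorted(zip(xs, ys))
    best_threshold, best_rss = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical x values cannot be separated by a threshold
        # Candidate threshold: midpoint between two neighbouring x values.
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        # Each side predicts its own mean, so the split's cost is the summed RSS.
        total = rss(left) + rss(right)
        if total < best_rss:
            best_threshold, best_rss = threshold, total
    return best_threshold, best_rss


# Toy data (invented for illustration): low response for small x, high for large x.
xs = [1, 2, 3, 10, 11, 12]
ys = [5.0, 6.0, 5.5, 20.0, 21.0, 19.0]
threshold, total_rss = best_split(xs, ys)
print(threshold, total_rss)
```

On this toy data the chosen threshold lies between x = 3 and x = 10, which is where the response jumps.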

Question

When to stop splitting (to prevent overfitting)?
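One common answer, sketched here under assumptions, is to stop when a node holds too few samples or the tree is already too deep. The parameter names `min_samples_split` and `max_depth`, their default values and the toy data are illustrative choices, not values taken from the lecture.

```python
def mean(values):
    return sum(values) / len(values)


def rss(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return the threshold on x with minimum total RSS, or None if no split exists."""
    pairs = sorted(zip(xs, ys))
    best_threshold, best_rss = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        total = rss([y for _, y in pairs[:i]]) + rss([y for _, y in pairs[i:]])
        if total < best_rss:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_rss = total
    return best_threshold


def build_tree(xs, ys, min_samples_split=3, max_depth=3, depth=0):
    """Grow a tree as nested dicts; a leaf is {"value": mean of its ys}."""
    # Stopping rules (assumed defaults): too few samples, or tree too deep.
    if len(ys) < min_samples_split or depth >= max_depth:
        return {"value": mean(ys)}
    threshold = best_split(xs, ys)
    if threshold is None:  # all x values identical, nothing to split on
        return {"value": mean(ys)}
    left = [(x, y) for x, y in zip(xs, ys) if x < threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x >= threshold]
    return {
        "threshold": threshold,
        "left": build_tree([x for x, _ in left], [y for _, y in left],
                           min_samples_split, max_depth, depth + 1),
        "right": build_tree([x for x, _ in right], [y for _, y in right],
                            min_samples_split, max_depth, depth + 1),
    }


def count_leaves(node):
    if "value" in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])


# Toy data (invented for illustration).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [5.0, 6.0, 5.5, 5.2, 20.0, 21.0, 19.0, 20.5]
unrestricted = build_tree(xs, ys, min_samples_split=2, max_depth=10)
restricted = build_tree(xs, ys, min_samples_split=5, max_depth=10)
print(count_leaves(unrestricted), count_leaves(restricted))
```

With `min_samples_split=2` the tree keeps splitting until every leaf holds a single training sample, i.e. it memorises the data. Requiring at least 5 samples to split leaves only the one split that separates the low and high responses.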


Question

What if there are multiple features in a regression tree?
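A usual approach, sketched here as an assumption rather than as the lecture's own answer, is to run the same RSS search on every feature and keep the feature and threshold pair with the lowest total RSS. The function name `best_split_multi` and the toy data are invented for this example.

```python
def rss(values):
    """Residual sum of squares of the values around their own mean."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split_multi(rows, ys):
    """Return (feature_index, threshold, total_rss) minimising RSS over all features."""
    best = (None, None, float("inf"))
    n_features = len(rows[0])
    for j in range(n_features):
        values = sorted(set(row[j] for row in rows))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, ys) if row[j] < threshold]
            right = [y for row, y in zip(rows, ys) if row[j] >= threshold]
            total = rss(left) + rss(right)
            if total < best[2]:
                best = (j, threshold, total)
    return best


# Toy data (invented): feature 0 is unrelated to y, feature 1 drives the response.
rows = [[3, 1], [1, 2], [4, 3], [2, 10], [5, 11], [0, 12]]
ys = [5.0, 6.0, 5.5, 20.0, 21.0, 19.0]
feature, threshold, total = best_split_multi(rows, ys)
print(feature, threshold, total)
```

Here the search picks feature 1, because no threshold on feature 0 separates the low responses from the high ones nearly as well.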


Question

Why are regression trees prone to overfitting?

Question

What are the negative consequences of an overfitted model?


More questions#

Question

Suppose that we are not allowed to adjust the size of the training set (train_size). What else can we do to train a better random forest that might further reduce the test error? [5]