CSE416 Introduction to Machine Learning
Q1 Learning a Tree
1 Point
Select one option.
Consider the following dataset.
If we use the decision tree algorithm to learn a decision tree from this dataset, what feature would be used as the split for the root node?
h1(x)
h2(x)
h3(x)
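The dataset for this question is shown as an image, so the sketch below uses a small hypothetical binary dataset; it only illustrates the mechanics the question relies on: split on each candidate feature, predict the majority class in each branch, and pick the feature with the lowest classification error.

```python
# Sketch of the root-split computation on a hypothetical binary dataset
# (the real dataset is shown as an image above). For each candidate feature:
# split the data, predict the majority class in each branch, and pick the
# feature with the lowest classification error.
import numpy as np

def split_error(feature_values, labels):
    """Classification error after splitting on one binary feature."""
    mistakes = 0
    for v in np.unique(feature_values):
        branch = labels[feature_values == v]
        mistakes += len(branch) - np.bincount(branch).max()  # minority count
    return mistakes / len(labels)

# Hypothetical data: columns are h1(x), h2(x), h3(x); y holds the labels.
X = np.array([[1, 0, 1],
              [0, 1, 1],
              [1, 1, 0],
              [0, 0, 0]])
y = np.array([1, 0, 1, 0])

for j, name in enumerate(["h1(x)", "h2(x)", "h3(x)"]):
    print(name, split_error(X[:, j], y))
# The root split is the feature with the lowest error (ties broken arbitrarily).
```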
Q2 Decision Boundaries
4 Points
Which of the following pictures show decision boundaries that are possible to represent with a decision tree only using the features Age and Income? The regions with the green background are predicted positive and the regions with the orange background are predicted negative.
Q2.1
1 Point
This decision boundary can be learnt by a decision tree classifier only using the features Age and Income.
This decision boundary cannot be learnt by a decision tree classifier only using the features Age and Income.
Q2.2
1 Point
This decision boundary can be learnt by a decision tree classifier only using the features Age and Income.
This decision boundary cannot be learnt by a decision tree classifier only using the features Age and Income.
Q2.3
1 Point
This decision boundary can be learnt by a decision tree classifier only using the features Age and Income.
This decision boundary cannot be learnt by a decision tree classifier only using the features Age and Income.
Q2.4
1 Point
This decision boundary can be learnt by a decision tree classifier only using the features Age and Income.
This decision boundary cannot be learnt by a decision tree classifier only using the features Age and Income.
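A minimal sketch, using made-up Age/Income data, of why decision-tree boundaries are always axis-aligned: every internal node tests a single feature against a threshold, so the rules printed below all have the form "Age <= t" or "Income <= t", and the resulting decision regions are unions of axis-aligned rectangles in the (Age, Income) plane.

```python
# Sketch on made-up Age/Income data: every split is a single-feature threshold,
# so the printed rules are all axis-aligned cuts.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.uniform(low=[18, 10_000], high=[70, 150_000], size=(200, 2))  # [Age, Income]
y = ((X[:, 0] > 40) & (X[:, 1] > 60_000)).astype(int)                 # toy labels

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(export_text(tree, feature_names=["Age", "Income"]))
```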
Q3 Tree Depth
2 Points
Q3.1 Bias/Variance
1 Point
A shallower decision tree will have __ bias and __ variance than a deeper decision tree.
higher bias, higher variance
higher bias, lower variance
lower bias, higher variance
lower bias, lower variance
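A minimal sketch on synthetic data (the dataset and depths are made up) of the trade-off behind this question: as depth grows, training fit keeps improving while test performance eventually stops improving or degrades.

```python
# Sketch on synthetic data: shallow trees underfit (higher bias, lower variance),
# deep trees fit the training set very closely (lower bias, higher variance).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for depth in (1, 3, 10):
    model = DecisionTreeRegressor(max_depth=depth).fit(X_tr, y_tr)
    print(depth, model.score(X_tr, y_tr), model.score(X_te, y_te))
# Training fit keeps improving with depth; test performance eventually degrades.
```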
Q3.2 Comparing Trees
1 Point
If decision tree T1 has lower training error than decision tree T2, then T1 will always have lower test error than T2.
True
False
Q4 Calculating Classification Error
1 Point
Provide a numeric answer.
Based on the implementation in lecture, compute the classification error of the following tree, which has two output classes, Safe and Risky.
Please give your answer to two decimal places. Do not start your answer with a dot: write 0.5, not .5.
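The tree itself is shown as an image, so the sketch below uses hypothetical per-leaf class counts; it only shows the computation from lecture: each leaf predicts its majority class, so the classification error is the total number of minority-class examples divided by the total number of examples.

```python
# Sketch with hypothetical per-leaf class counts (the actual tree is an image).
# Each leaf predicts its majority class, so the error is mistakes / total.
leaves = [
    # (number of Safe examples, number of Risky examples) reaching each leaf
    (18, 2),
    (4, 9),
    (1, 6),
]

mistakes = sum(min(safe, risky) for safe, risky in leaves)  # minority counts
total = sum(safe + risky for safe, risky in leaves)
print(mistakes / total)  # classification error; report to two decimal places
```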
Q5 Splitting on a numeric feature
1 Point
Provide a numeric answer.
In building a decision tree to classify whether a loan is risky or not, we choose to split on the feature Annual Income using 10 training examples.
Here are the values of this column for examples of each output class:
Risky: 10k, 15k, 40k, 100k
Safe: 20k, 61k, 82k, 89k, 95k, 96k
The image below shows another view of the same data:
What is the classification error of the best split?
Please give your answer to one decimal place. Do not start your answer with a dot: write 0.5, not .5.
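A sketch of the standard procedure for a numeric split, using the income values listed above: sort the values, consider a threshold between each pair of consecutive values, predict the majority class on each side, and keep the split with the fewest mistakes.

```python
# Sketch using the income values given in the question: try a threshold between
# each pair of consecutive sorted values, predict the majority class on each
# side, and keep the split with the fewest mistakes.
risky = [10_000, 15_000, 40_000, 100_000]
safe = [20_000, 61_000, 82_000, 89_000, 95_000, 96_000]
data = sorted([(x, "risky") for x in risky] + [(x, "safe") for x in safe])

def mistakes(threshold):
    total = 0
    for side in ([l for x, l in data if x < threshold],
                 [l for x, l in data if x >= threshold]):
        if side:
            majority = max(set(side), key=side.count)
            total += sum(label != majority for label in side)
    return total

thresholds = [(data[i][0] + data[i + 1][0]) / 2 for i in range(len(data) - 1)]
print(min(mistakes(t) for t in thresholds) / len(data))  # best split's error
```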
Q6 Comparing Ensembles
2 Points
Q6.1 Which Model?
1 Point
Select one option.
Which of the following choices describes an ensemble model where each of the models in the ensemble can easily be trained in parallel (i.e., in any order)?
Decision Tree
Random Forest
AdaBoost
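A minimal sketch (synthetic data) of parallel ensemble fitting in scikit-learn: each tree in a bagged ensemble depends only on its own bootstrap sample, so the trees can be trained in any order, whereas boosting fits its models sequentially because each one depends on the previous models' mistakes.

```python
# Sketch (synthetic data): trees in a bagged ensemble are independent given
# their bootstrap samples, so scikit-learn can fit them on all cores at once.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)
forest = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=0)
forest.fit(X, y)  # n_jobs=-1 trains the trees in parallel on all available cores
```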
Q6.2 More Trees?
1 Point
Consider the following claim. Select the ensemble models discussed in class for which the claim is generally true.
Claim: For this ensemble model, the number of trees used in the ensemble needs to be chosen carefully so as to avoid overfitting.
Random Forest
AdaBoost
Q7 AdaBoost
1 Point
Select one option.
Suppose we are running AdaBoost using decision tree stumps. At a particular iteration, the data points have weights according to the figure (larger points indicate heavier weights).
Which of the following decision tree stumps is most likely to be fit in the next iteration?
Hint: Notice the labels on the decision boundary. Each label shows the prediction for the side of the boundary that lies under or to the right of the word "Predict".
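The figure is not reproduced here, so the sketch below uses made-up points and weights; it only shows the mechanism the question tests: the next stump is chosen to minimize the weighted classification error, so heavily weighted points have the most influence on where the split is placed.

```python
# Sketch with made-up points and weights: the next stump minimizes the
# *weighted* classification error, so heavily weighted points matter most.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 2))
y = (X[:, 0] > 0).astype(int)

weights = np.ones(len(X))
weights[:5] = 10.0  # pretend these points were misclassified earlier and up-weighted

stump = DecisionTreeClassifier(max_depth=1)
stump.fit(X, y, sample_weight=weights)  # a weighted decision stump, as in AdaBoost
```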
Q8 ML Practitioner Scenarios
4 Points
Consider the scenarios below and determine whether you would or would not recommend the suggested idea. In your answer, state whether the suggestion is "Correct" or "Incorrect" and justify why it would or would not be a good idea.
Q8.1
2 Points
Pavan's computer has 8 cores in the CPU, and each core can be responsible for a parallel task. He plans to use all of them for training a random forest classifier. On each core, he makes an exact copy of the original training dataset and trains a decision tree on that copy.
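For reference when judging the scenario, here is a minimal sketch (made-up data; joblib assumed for the parallelism) of how bagging-style training is usually parallelized: each core fits its tree on its own bootstrap sample drawn with replacement from the training set.

```python
# Sketch (made-up data; joblib assumed for the parallelism): in bagging-style
# ensembles such as random forests, each core trains its tree on its own
# bootstrap sample drawn with replacement from the training set.
import numpy as np
from joblib import Parallel, delayed
from sklearn.tree import DecisionTreeClassifier

def fit_one_tree(X, y, seed):
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(X), size=len(X))  # rows sampled with replacement
    return DecisionTreeClassifier(random_state=seed).fit(X[idx], y[idx])

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# One tree per core, each on a different bootstrap sample (8 cores in the scenario).
trees = Parallel(n_jobs=8)(delayed(fit_one_tree)(X, y, s) for s in range(8))
```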