1.
In the previous question, suppose you have identified multicollinear features. Which of the following action(s) would you perform next?
1. Remove both collinear variables.
2. Instead of removing both variables, we can remove only one variable.
3. Removing correlated variables might lead to loss of information. In order to retain those variables, we can use penalized regression models like ridge or lasso regression.
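As an illustration of the penalized option above, here is a minimal sketch, assuming synthetic data and a hand-written closed-form ridge solver (the helper `ridge_fit`, the data, and the penalty value `alpha=1.0` are all hypothetical choices for this example). It compares an unpenalized fit with a ridge fit on two nearly identical features.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 1e-3 * rng.normal(size=n)      # x2 is almost a copy of x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(scale=0.1, size=n)

def ridge_fit(X, y, alpha):
    """Closed-form ridge weights without intercept: (X^T X + alpha I)^-1 X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

w_ols = ridge_fit(X, y, alpha=0.0)       # alpha = 0 reduces to ordinary least squares
w_ridge = ridge_fit(X, y, alpha=1.0)     # assumed penalty strength

# The penalty improves the conditioning of the matrix being inverted.
cond_ols = np.linalg.cond(X.T @ X)
cond_ridge = np.linalg.cond(X.T @ X + 1.0 * np.eye(2))

print("OLS weights:  ", w_ols)
print("Ridge weights:", w_ridge)
print("Condition numbers:", cond_ols, cond_ridge)
```

In this sketch both features are kept, and the ridge penalty spreads the weight across them instead of letting the two coefficients take large offsetting values.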
2.
Adding a non-important feature to a linear regression model may result in:
1. Increase in R-square
2. Decrease in R-square
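The effect on the training R-square can be checked numerically. The following is a rough sketch, assuming synthetic data and a hypothetical helper `r_squared` built on `numpy.linalg.lstsq`; it fits ordinary least squares with and without an extra pure-noise feature.

```python
import numpy as np

def r_squared(X, y):
    """Training R-square of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=(n, 1))              # informative feature
noise_feature = rng.normal(size=(n, 1))  # unrelated to y by construction
y = 2.0 * x[:, 0] + rng.normal(size=n)

r2_base = r_squared(x, y)
r2_more = r_squared(np.hstack([x, noise_feature]), y)

print("R-square with one feature:      ", r2_base)
print("R-square with the noise feature:", r2_more)
```

Because the larger model contains the smaller one as a special case, the least-squares fit on the training data cannot get worse when a column is added.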
3.
Suppose you are given three variables X, Y and Z. The Pearson correlation coefficients for (X, Y), (Y, Z) and (X, Z) are C1, C2 and C3 respectively. Now, you have added 2 to all values of X (i.e. new values become X+2), subtracted 2 from all values of Y (i.e. new values are Y-2), and Z remains the same. The new coefficients for (X, Y), (Y, Z) and (X, Z) are given by D1, D2 and D3 respectively. How do the values of D1, D2 and D3 relate to C1, C2 and C3?
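The relationship can be checked on a small example. This is a minimal sketch in plain Python, assuming made-up values for X, Y and Z and a hand-written `pearson` helper (both hypothetical, not part of the question).

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)

# Illustrative values only.
X = [1.0, 2.0, 4.0, 7.0, 11.0]
Y = [2.0, 1.0, 5.0, 6.0, 9.0]
Z = [5.0, 3.0, 4.0, 1.0, 2.0]

X_shift = [v + 2 for v in X]   # add 2 to every value of X
Y_shift = [v - 2 for v in Y]   # subtract 2 from every value of Y

C = (pearson(X, Y), pearson(Y, Z), pearson(X, Z))
D = (pearson(X_shift, Y_shift), pearson(Y_shift, Z), pearson(X_shift, Z))

print("C1, C2, C3:", C)
print("D1, D2, D3:", D)
```

Adding a constant moves the mean by the same constant, so every deviation from the mean, and therefore every term in the formula, is unchanged.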
4.
Imagine you are solving a classification problem with highly imbalanced classes. The majority class is observed 99% of the time in the training data. Your model has 99% accuracy on the test data. Which of the following is true in such a case?
1. Accuracy metric is not a good idea for imbalanced class problems.
2. Accuracy metric is a good idea for imbalanced class problems.
3. Precision and recall metrics are good for imbalanced class problems.
4. Precision and recall metrics aren't good for imbalanced class problems.
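To make the scenario concrete, here is a small sketch in plain Python. It assumes a hypothetical test set of 1000 examples with 10 minority-class cases (label 1) and a trivial model that always predicts the majority class (label 0); these numbers are illustrative, chosen to match the 99% figure in the question.

```python
# Hypothetical test set: 1% minority class (1), 99% majority class (0).
y_true = [1] * 10 + [0] * 990
# A trivial "model" that always predicts the majority class.
y_pred = [0] * len(y_true)

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)

accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
recall = tp / (tp + fn) if (tp + fn) else 0.0
# Precision is undefined here because the model never predicts the minority class.
precision = tp / (tp + fp) if (tp + fp) else None

print("accuracy: ", accuracy)
print("recall:   ", recall)
print("precision:", precision)
```

The always-majority model reaches 99% accuracy without finding a single minority-class example, which is what recall on the minority class exposes.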
5.
In ensemble learning, you aggregate the predictions of weak learners, so that the ensemble gives a better prediction than any of the individual models. Which of the following statements is/are true for weak learners used in an ensemble model?
1. They don't usually overfit.
2. They have high bias, so they cannot solve complex learning problems.
3. They usually overfit.
6.
Which of the following options is/are true for K-fold cross-validation?
1. Increasing K will increase the time required to cross-validate the result.
2. Higher values of K will result in higher confidence in the cross-validation result compared to lower values of K.
3. If K=N, where N is the number of observations, it is called leave-one-out cross-validation.
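The mechanics behind these statements can be sketched in a few lines of plain Python. The generator `k_fold_indices` below is a hypothetical helper written for this example (contiguous, unshuffled folds), not a library function.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over range(n)."""
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, test_idx
        start += size

n = 6
folds_3 = list(k_fold_indices(n, 3))   # K = 3: three model fits
loo = list(k_fold_indices(n, n))       # K = N: leave-one-out, N model fits

for train_idx, test_idx in folds_3:
    print("train:", train_idx, "test:", test_idx)
print("leave-one-out fits:", len(loo))
```

Each fold requires one model fit, so the number of fits grows with K, and with K=N every test fold holds exactly one observation.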
7.
Cross-validation is an important step in machine learning for hyper parameter tuning. Let's say you are tuning a hyper-parameter
8.
Cross-validation is an important step in machine learning for hyper parameter tuning. Let's say you are tuning a hyper-parameter
9.
10.
What would you do in PCA to get the same projection as SVD?
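One common reading of this question is that the two coincide once the data have zero mean. The sketch below, assuming synthetic data and NumPy, computes principal directions from the covariance matrix of centered data and compares them with the right singular vectors of the same centered matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
mixing = np.array([[2.0, 0.0, 0.0],
                   [0.5, 1.0, 0.0],
                   [0.0, 0.3, 0.2]])
# Synthetic data with a non-zero mean (illustrative values only).
X = rng.normal(size=(50, 3)) @ mixing + np.array([5.0, -3.0, 1.0])

# Center each column to zero mean.
Xc = X - X.mean(axis=0)

# PCA via eigendecomposition of the sample covariance matrix.
cov = Xc.T @ Xc / (len(X) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]          # sort by decreasing variance
pca_components = eigvecs[:, order].T

# SVD of the centered data: rows of vt are the right singular vectors.
_, s, vt = np.linalg.svd(Xc, full_matrices=False)

# Directions agree up to sign, so compare absolute dot products.
alignment = np.abs(np.sum(pca_components * vt, axis=1))
print("alignment per component:", alignment)
print("variance from SVD:", s ** 2 / (len(X) - 1))
print("variance from PCA:", eigvals[order])
```

Running the SVD on the uncentered `X` instead would generally give different directions, since the first singular vector would then be pulled toward the mean of the data.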