This post collects important **Machine Learning MCQ** questions, each with its answer marked in bold. These **basic ML MCQs** cover topics such as classification, clustering, and supervised learning.

These **Machine Learning MCQ questions and answers** are useful preparation for placements and for college and university exams.

**More MCQs related to Machine Learning**

- Top MCQ on linear regression in Machine Learning
- MCQ on Clustering in Data Mining: Machine Learning
- What is Machine Learning? Basic terminologies in Machine Learning

## Machine Learning MCQs with answers

1. Which of the following is not a type of supervised learning?

A. Classification

B. Regression

**C. Clustering**

D. None of the above

2. As the amount of training data increases

A. Training error usually decreases and generalization error usually increases

B. Training error usually decreases and generalization error usually decreases

**C. Training error usually increases and generalization error usually decreases**

D. Training error usually increases and generalization error usually increases

3. Which of the following is not a classification task?

A. Find the gender of a person by analyzing his writing style

**B. Predict the price of a house based on floor area, number of rooms etc.**

C. Predict whether there will be abnormally heavy rainfall next year

D. Detect Pneumonia from Chest X-ray images

4. Occam’s razor is an example of:

**A. Inductive bias**

B. Preference bias

5. A feature F1 can take the values A, B, C, D, E and F, and represents the grade of students from a college. Which of the following statements is true in this case?

A. Feature F1 is an example of a nominal variable.

**B. Feature F1 is an example of an ordinal variable.**

C. It doesn’t belong to any of the above categories.

D. Both of these

6. Which of the following is a categorical feature?

A. Height of a person

B. Price of petroleum

**C. Mother tongue of a person**

D. Amount of rainfall in a day

7. Which of the following tasks is NOT a suitable machine learning task?

**A. Finding the shortest path between a pair of nodes in a graph**

B. Predicting if a stock price will rise or fall

C. Predicting the price of petroleum

D. Grouping mails as spams or non-spams

8. Which of the following is correct for reinforcement learning?

A. The algorithm plans a sequence of actions from the current state.

**B. The algorithm plans one action at each time step.**

C. The training instances contain examples of states and best actions of the states.

D. The algorithm groups unseen data based on similarity.

9. What is the use of Validation dataset in Machine Learning?

A. To train the machine learning model.

B. To evaluate the performance of the machine learning model

**C. To tune the hyperparameters of the machine learning model**

D. None of the above.

10. Identify whether the following statement is true or false:

“Overfitting is more likely when the set of training data is small”

**A. True**

B. False

**More Machine Learning MCQ**

11. Which of the following criteria is typically used for optimizing in linear regression?

A. Maximize the number of points it touches.

B. Minimize the number of points it touches.

**C. Minimize the squared distance from the points.**

D. Minimize the maximum distance of a point from a line.
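
To make the criterion in question 11 concrete, here is a minimal sketch of ordinary least squares for a single feature. It assumes the usual convention that "distance" means the vertical distance between each point and the line. The function names `fit_line` and `squared_error` and the data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared vertical distances."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares solution for one feature.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def squared_error(xs, ys, slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Illustrative data (an assumption, not from the quiz).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

slope, intercept = fit_line(xs, ys)
best = squared_error(xs, ys, slope, intercept)
```

Any other line, for example one with a slightly different slope, gives a squared error at least as large as `best`.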

12. Which of the following is false?

A. Bias is the true error of the best classifier in the concept class

B. Bias is high if the concept class cannot model the true data distribution well

**C. High bias leads to overfitting**

D. For high bias both train and test error will be high

13. Decision trees can be used for the following type of datasets:

I. The attributes are categorical

II. The attributes are numeric valued and continuous

III. The attributes are discrete valued numbers

A. In case I only

B. In case II only

C. In cases II and III only

**D. In cases I, II and III**

14. What is true for Stochastic Gradient Descent?

A. In every iteration, model parameters are updated for multiple training samples

**B. In every iteration, model parameters are updated for one training sample**

C. In every iteration, model parameters are updated for all training samples

D. None of the above
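
The difference between the options in question 14 is easiest to see in code. Below is a rough sketch of stochastic gradient descent for a one-feature linear model with squared loss, where each update uses a single training sample. The function name `sgd_linear`, the learning rate, the epoch count and the toy data are all assumptions chosen for illustration.

```python
import random


def sgd_linear(xs, ys, lr=0.05, epochs=500, seed=0):
    """Fit y = w * x + b with SGD: one sample per parameter update."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in random order
        for i in indices:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of 0.5 * error**2 for this ONE sample only.
            w -= lr * error * xs[i]
            b -= lr * error
    return w, b


# Noise-free toy data generated from y = 2x + 1 (an assumption).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = sgd_linear(xs, ys)
```

Batch gradient descent would instead sum the gradient over all samples before each update, and mini-batch gradient descent would sum it over a small subset.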

15. Imagine you are dealing with a 15-class classification problem. What is the maximum number of discriminant vectors that can be produced by LDA?

A. 20

**B. 14**

C. 21

D. 10

16. ‘People who bought this, also bought…’ recommendations seen on Amazon are a result of which algorithm?

A. User based Collaborative filtering

B. Content based filtering

**C. Item based Collaborative filtering**

D. None of the above

17. Which of the following is/are true about PCA?

1. PCA is a supervised method

2. It identifies the directions that data have the largest variance

3. Maximum number of principal components <= number of features

4. All principal components are orthogonal to each other

A. Only 2

B. 1, 3 and 4

C. 1, 2 and 3

**D. 2, 3 and 4**

18. When there is noise in data, which of the following options would improve the performance of the KNN algorithm?

**A. Increase the value of k**

B. Decrease the value of k

C. Changing value of k will not change the effect of the noise

D. None of these
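
A small experiment shows why a larger k helps in question 18. The sketch below is a bare-bones 1-D KNN classifier with majority voting. The function name `knn_predict`, the labels and the data (including one deliberately mislabelled point) are assumptions made up for this example.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k nearest 1-D training points."""
    neighbours = sorted(train, key=lambda point: abs(point[0] - query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


train = [
    (1.0, "a"), (1.2, "a"), (1.4, "a"), (1.6, "a"),
    (1.5, "b"),  # noisy point: sits inside the "a" cluster with the wrong label
    (5.0, "b"), (5.2, "b"), (5.4, "b"),
]

pred_k1 = knn_predict(train, 1.52, k=1)  # decided by the noisy point alone
pred_k5 = knn_predict(train, 1.52, k=5)  # noisy point is outvoted
print(pred_k1, pred_k5)  # b a
```

With k = 1 the single noisy neighbour decides the label, while with k = 5 the four correctly labelled neighbours outvote it.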

19. Which of the following statements is True about the KNN algorithm?

**A. The KNN algorithm does more computation at test time than at train time.**

B. The KNN algorithm does less computation at test time than at train time.

C. The KNN algorithm does an equal amount of computation at test time and train time.

D. None of these.

20. A spam filtering system has a probability of 0.95 to correctly classify a mail as spam and 0.10 probability of giving false positives. It is estimated that 1% of the mails are actual spam mails. Suppose that the system is now given a new mail to be classified as spam/ not-spam, what is the probability that the mail will be classified as spam?

A. 0.89575

B. 0.10425

**C. 0.1085**

D. 0.0995
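
The answer to question 20 follows from the law of total probability: a mail is flagged either because it is spam and was caught, or because it is not spam and triggered a false positive. A quick check, with variable names chosen for this example:

```python
p_spam = 0.01             # prior: 1% of mails are spam
p_flag_given_spam = 0.95  # spam correctly classified as spam
p_flag_given_ham = 0.10   # false-positive rate on non-spam mail

# P(flagged) = P(flagged | spam) P(spam) + P(flagged | not spam) P(not spam)
p_flag = p_flag_given_spam * p_spam + p_flag_given_ham * (1 - p_spam)
print(round(p_flag, 4))  # 0.1085
```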

21. Bag I contains 4 white and 6 black balls while another Bag II contains 4 white and 3 black balls. One ball is drawn at random from one of the bags and it is found to be black. Find the probability that it was drawn from Bag I.

A. 1/2

B. 2/3

**C. 7/12**

D. 9/23
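
Question 21 is a direct application of Bayes' theorem. The check below assumes each bag is equally likely to be chosen, which the question implies but does not state outright. Exact fractions avoid rounding issues.

```python
from fractions import Fraction

# Assumption: either bag is picked with probability 1/2.
prior = {"I": Fraction(1, 2), "II": Fraction(1, 2)}
# Probability of drawing a black ball from each bag.
p_black = {"I": Fraction(6, 10), "II": Fraction(3, 7)}

# P(black) by the law of total probability.
evidence = sum(prior[bag] * p_black[bag] for bag in prior)
# Bayes' theorem: P(Bag I | black).
posterior_bag_one = prior["I"] * p_black["I"] / evidence
print(posterior_bag_one)  # 7/12
```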

22. In a Bayesian network a node with only outgoing edge(s) represents

**A. a variable conditionally independent of the other variables.**

B. a variable dependent on its siblings.

C. a variable whose dependency is uncertain.

D. None of the above.

24. Which of the following statements is correct?

I. Logistic Regression is used for regression purposes.

II. Logistic Regression is used for classification purposes.

A) Only I is Correct

**B) Only II is Correct**

C) Both I and II are Correct

D) Both I and II are Incorrect

25. Which of the following methods do we use to best fit the data in Logistic Regression?

A) Least Square Error

**B) Maximum Likelihood**

C) Jaccard distance

D) Both A and B
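
To illustrate what "maximum likelihood" means in question 25, here is a rough sketch that fits a one-feature logistic regression by gradient ascent on the log-likelihood. The function names, step size, step count and toy data are assumptions for illustration. Real libraries typically use more sophisticated optimizers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(w, b, xs, ys):
    """Log-likelihood of binary labels ys under the logistic model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total


def fit_logistic(xs, ys, lr=0.05, steps=2000):
    """Maximize the log-likelihood with plain gradient ascent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = sum((y - sigmoid(w * x + b)) * x for x, y in zip(xs, ys))
        grad_b = sum(y - sigmoid(w * x + b) for x, y in zip(xs, ys))
        w += lr * grad_w  # step uphill on the log-likelihood
        b += lr * grad_b
    return w, b


# Toy, non-separable data (an assumption): larger x tends to mean class 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 1, 0, 1, 1]

w, b = fit_logistic(xs, ys)
```

The fitted parameters give a higher log-likelihood than the starting point `w = 0, b = 0`, which is the quantity being optimized rather than a squared error.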

State whether True or False.

26. After training an SVM, we can discard all examples which are not support vectors and can still classify new examples.

**A) TRUE**

B) FALSE

27. Suppose you are dealing with a 3-class classification problem and you want to train an SVM model on the data using the One-vs-all method.

How many times do we need to train our SVM model in this case?

A) 1

B) 2

**C) 3**

D) 4

28. What is/are true about kernel in SVM?

1. A kernel function maps low dimensional data to a high dimensional space

2. It’s a similarity function

A) 1

B) 2

**C) 1 and 2**

D) None of these.

29. Suppose you are using an RBF kernel in SVM with a high Gamma value. What does this signify?

A) The model would consider even far away points from hyperplane for modelling.

**B) The model would consider only the points close to the hyperplane for modelling.**

C) The model would not be affected by distance of points from hyperplane for modelling.

D) None of the above

30. In training a neural network, we notice that the loss does not decrease in the first few epochs. What is the reason for this?

A) The learning Rate is low.

B) Regularization Parameter is High.

C) Stuck at the Local Minima.

**D) All of these could be the reason.**

31. What is the sequence of the following tasks in a perceptron?

I) Initialize the weights of the perceptron randomly.

II) Go to the next batch of data set.

III) If the prediction does not match the output, change the weights.

IV) For a sample input, compute an output.

A) I, II, III, IV

B) IV, III, II, I

C) III, I, II, IV

**D) I, IV, III, II**
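
The sequence in question 31 maps directly onto a training loop. The sketch below follows the classic perceptron learning rule and processes one sample at a time instead of a batch. The function names, the learning rate, the epoch count and the AND-gate data are assumptions chosen for illustration.

```python
import random


def train_perceptron(samples, epochs=100, lr=1.0, seed=0):
    """Train a perceptron with a step activation on (inputs, target) pairs."""
    rng = random.Random(seed)
    # I) Initialize the weights (and bias) randomly.
    weights = [rng.uniform(-0.5, 0.5) for _ in samples[0][0]]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(epochs):
        for inputs, target in samples:  # II) move on to the next sample
            # IV) For a sample input, compute an output.
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1 if activation > 0 else 0
            # III) If the prediction does not match, change the weights.
            if output != target:
                delta = lr * (target - output)
                weights = [w + delta * x for w, x in zip(weights, inputs)]
                bias += delta
    return weights, bias


def predict(weights, bias, inputs):
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


# Linearly separable toy problem: logical AND.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
predictions = [predict(weights, bias, inputs) for inputs, _ in and_gate]
print(predictions)  # [0, 0, 0, 1]
```

Because AND is linearly separable, the perceptron rule is guaranteed to stop making mistakes after a finite number of updates. A problem such as XOR would not converge with a single perceptron.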

32. Which of the following is true about model capacity (where model capacity means the ability of neural network to approximate complex functions)?

**A) As the number of hidden layers increases, model capacity increases**

B) As dropout ratio increases, model capacity increases

C) As learning rate increases, model capacity increases

D) None of these

33. Which of the following is true?

Single layer associative neural networks do not have the ability to

I) Perform pattern recognition

II) Find the parity of a picture

III) Determine whether two or more shapes in a picture are connected or not

**A) II and III are true**

B) II is true

C) All of the above

D) None of the above

34. The network that involves backward links from outputs to the inputs and hidden layers is called

A) Self-organizing Maps

B) Perceptron

**C) Recurrent Neural Networks**

D) Multi-Layered Perceptron

35. Which of the following options is/are correct regarding the benefits of an ensemble model?

1. Better performance

2. More generalized model

3. Better interpretability

A) 1 and 3

B) 2 and 3

**C) 1 and 2**

D) 1, 2 and 3

36. Ensembles will yield bad results when there is a significant diversity among the models. Write True or False.

A) True

**B) False**

37. Which of the following algorithms is not an ensemble learning algorithm?

A) Random Forest

B) Adaboost

C) Gradient Boosting

**D) Decision Trees**

38. For two runs of K-Means clustering, is it expected to get the same clustering results?

A) Yes

**B) No**

40. Which of the following can act as possible termination conditions in K-Means?

I. For a fixed number of iterations.

II. Assignment of observations to clusters does not change between iterations, except for cases with a bad local minimum.

III. Centroids do not change between successive iterations.

IV. Terminate when RSS falls below a threshold

A) I, III and IV

B) I, II and III

C) I, II and IV

**D) All of the above**
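
The termination conditions from question 40 can be seen side by side in a small implementation. This is a minimal 1-D K-Means sketch, not a production algorithm. The function name `kmeans_1d`, the tolerance, the data and the starting centroids are assumptions for illustration. Condition IV (an RSS threshold) is left out to keep the example short.

```python
def kmeans_1d(points, centroids, max_iter=100, tol=1e-9):
    """Return (centroids, assignments, iterations, reason_for_stopping)."""
    centroids = list(centroids)
    assignments = None
    for iteration in range(1, max_iter + 1):  # condition I: iteration cap
        # Assign each point to its nearest centroid.
        new_assignments = [
            min(range(len(centroids)), key=lambda c: abs(p - centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:  # condition II
            return centroids, assignments, iteration, "assignments unchanged"
        assignments = new_assignments

        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(sum(members) / len(members) if members else centroids[c])

        shift = max(abs(n - o) for n, o in zip(new_centroids, centroids))
        centroids = new_centroids
        if shift < tol:  # condition III
            return centroids, assignments, iteration, "centroids unchanged"
    return centroids, assignments, max_iter, "iteration limit reached"


points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
centroids, assignments, iterations, reason = kmeans_1d(points, [0.0, 10.0])
print(centroids, iterations, reason)  # [1.5, 8.5] 2 assignments unchanged
```

Starting from different initial centroids can lead to different final clusters on less tidy data, which is why two runs of K-Means are not guaranteed to agree.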

**Thanks for reading Machine Learning MCQ on our website.**