2. Respond to the assignment with answers to the following questions ...

a) What do folks mean when they say a classification model exhibits "high bias" or "high variance"?

ANSWER: A model with "high bias" underfits both the training and testing data, while a model with "high variance" overfits the training data and therefore generalizes poorly to the testing data. For a high bias model, large changes in the input values may result in little or no change in the output. For a high variance model, small changes in the input values may result in large changes in the output.

b) How are these two terms related?

ANSWER: Model performance often varies as a function of model complexity. An overly simple (high bias) model will be unable to recognize differences between classes for either the training or the testing data, while an overly complex (high variance) model will overfit the training data, exhibiting good performance on the training data but poor performance on the testing data. Model complexity can be viewed as a knob: the overly simple end of the range produces a high bias model, and the overly complex end produces a high variance model.

c) How do these terms relate to the choice of "k" for k-fold cross validation?

ANSWER: A very small value of "k" (e.g. 2-fold cross validation) may produce high bias models, while a very large value of "k" (e.g. leave-one-out cross validation) may produce high variance performance estimates. With 2-fold cross validation, the learning algorithm only gets to see half of the available data in each fold; with leave-one-out cross validation, it gets to see all but one observation. Using either a very small or a very large value of "k" can significantly reduce the quality of the performance estimates. For model selection, this means we may select a model with poorer generalization performance. For model assessment, this means we may produce either an overly pessimistic (small k) or an overly optimistic (large k) estimate of generalization performance.
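As a concrete illustration of the k-fold point (a minimal pure-Python sketch, not tied to any particular library), the splitter below shows how the amount of training data seen by the learning algorithm changes with k: with n = 10 observations, 2-fold cross validation trains each model on only 5 observations, while leave-one-out (k = n) trains each model on 9.

```python
def k_fold_splits(n, k):
    """Partition indices 0..n-1 into k contiguous folds and yield
    (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n))
    # Distribute any remainder so fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

n = 10

# 2-fold CV: each model is trained on only half the data,
# so the trained models tend toward higher bias.
print([len(train) for train, _ in k_fold_splits(n, 2)])      # [5, 5]

# Leave-one-out CV (k = n): each model sees all but one observation,
# so the k trained models are nearly identical and the averaged
# performance estimate tends toward higher variance.
print([len(train) for train, _ in k_fold_splits(n, n)][:3])  # [9, 9, 9]
```

Every observation appears in exactly one test fold, so averaging the per-fold scores uses each data point once for assessment; only the train/test split sizes, and hence the bias/variance trade-off of the estimate, change with k.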