Training Program on Data Mining / Predictive Analytics

Technical Aspects of Data Mining / Predictive Analytics  

  1. How does one evaluate the quality of a predictive model? Training data vs. validation data. Concept of cross-validation.
  2. What is regression analysis?
  3. What is the taxonomy of regression models? Parametric vs. non-parametric, linear vs. non-linear, robust vs. non-robust regression.
  4. What is generalization error? Overfitting vs. underfitting.
  5. What is the tradeoff between bias and variance?
  6. What is logistic regression?
  7. What is a confusion matrix? Misclassification error vs. misclassification cost.
  8. What is a lift chart? Lift chart vs. gain chart.
  9. What is multicollinearity?
  10. What is the difference between statistical regression and data mining regression?
  11. What is data mining jargon? Statistical jargon vs. data mining jargon.
  12. What transformations are useful for continuous variables?
  13. What transformations are useful for categorical variables?
  14. What is Cluster Analysis? Customer segmentation vs. cluster analysis.
  15. Supervised training vs. unsupervised training. Classification vs. cluster analysis.
  16. What is the relationship between Theory and Cluster Analysis?
  17. What is boosting? Weak learner vs. strong learner.
  18. What is machine learning?
  19. What is a tree-based model? CART vs. CHAID.
  20. What is Random Forest?
  21. What is TreeNet/TreeBoost?
  22. What are Multivariate Adaptive Regression Splines (MARS)?
  23. What is a Support Vector Machine (SVM)?
  24. What is Association/Market Basket Analysis?
  25. What is a Genetic Algorithm?
  26. What is a Neural Net? What is Neural Net architecture? What is the taxonomy of Neural Net models?
  27. What are data preparation headaches?
  28. What is the relationship among the number of variables, the number of unknown parameters to be estimated, and the number of observations?
  29. What are the major data mining tasks?
  30. Which data mining tool is the best for Risk/Customer/Sales/Marketing Analytics?
  31. What are the golden rules of data mining?
  32. What is credit scoring?
  33. What is a scorecard?