IJMLC 2018 Vol.8(5): 423-427 ISSN: 2010-3700
DOI: 10.18178/ijmlc.2018.8.5.723

A Hybrid Active Learning and Progressive Sampling Algorithm

Amr ElRafey and Janusz Wojtusiak

Abstract—Sampling techniques for data mining applications can be broadly categorized into Random Sampling (RS), Active Learning (AL) and Progressive Sampling (PS). Progressive sampling techniques grow an initial sample up to the point beyond which model accuracy no longer improves significantly, and these methods have been shown to be computationally efficient. The choice of sampling schedule for progressive sampling remains an open research question: available schedules may overshoot, producing a final sample that is larger than necessary, or may grow the sample too slowly, requiring many iterations before convergence is reached. We demonstrate that using Batch Mode Uncertainty Sampling, a technique from active learning, to grow the sample progressively can significantly improve the performance of progressive sampling. Through a series of trials on both simulated and real data, we show that our proposed Progressive Batch Mode Uncertainty Sampling (PBMUS) algorithm converges with a comparable or smaller number of data points, at higher accuracy and, in some cases, in less computational time.
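
The general idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' PBMUS implementation: the nearest-centroid model, the margin-based uncertainty score, the fixed batch size, the stopping tolerance `tol`, and all function names are assumptions chosen only to keep the example self-contained. It grows a labeled sample in batches of the most uncertain pool points and stops once validation accuracy improves by less than `tol`.

```python
import random

def fit_centroids(points, labels):
    """Fit a nearest-centroid model: the mean point of each class."""
    sums = {}
    for (x, y), c in zip(points, labels):
        sx, sy, n = sums.get(c, (0.0, 0.0, 0))
        sums[c] = (sx + x, sy + y, n + 1)
    return {c: (sx / n, sy / n) for c, (sx, sy, n) in sums.items()}

def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def predict(model, p):
    """Predict the class whose centroid is closest to p."""
    return min(model, key=lambda c: sq_dist(p, model[c]))

def margin(model, p):
    """Assumed uncertainty score: a small margin means the two nearest
    centroids are almost equally close, so the prediction is uncertain."""
    d = sorted(sq_dist(p, m) for m in model.values())
    return d[1] - d[0] if len(d) > 1 else 0.0

def accuracy(model, points, labels):
    hits = sum(predict(model, p) == c for p, c in zip(points, labels))
    return hits / len(points)

def progressive_uncertainty_sampling(pool_x, pool_y, val_x, val_y,
                                     init=10, batch=10, tol=0.005, seed=0):
    """Grow a sample batch by batch until accuracy stops improving."""
    rng = random.Random(seed)
    unlabeled = list(range(len(pool_x)))
    rng.shuffle(unlabeled)
    # Start from a small random sample, as in plain progressive sampling.
    sample, unlabeled = unlabeled[:init], unlabeled[init:]
    history = []  # (sample size, validation accuracy) per iteration
    while True:
        model = fit_centroids([pool_x[i] for i in sample],
                              [pool_y[i] for i in sample])
        acc = accuracy(model, val_x, val_y)
        history.append((len(sample), acc))
        converged = len(history) > 1 and acc - history[-2][1] < tol
        if converged or not unlabeled:
            return sample, model, history
        # Batch-mode uncertainty sampling: add the `batch` pool points
        # the current model is least sure about.
        unlabeled.sort(key=lambda i: margin(model, pool_x[i]))
        sample = sample + unlabeled[:batch]
        unlabeled = unlabeled[batch:]

def make_blobs(n, rng):
    """Simulated two-class data: unit Gaussians centered at (0,0) and (2,2)."""
    xs, ys = [], []
    for _ in range(n):
        c = rng.randrange(2)
        xs.append((rng.gauss(2.0 * c, 1.0), rng.gauss(2.0 * c, 1.0)))
        ys.append(c)
    return xs, ys

rng = random.Random(1)
pool_x, pool_y = make_blobs(500, rng)
val_x, val_y = make_blobs(200, rng)
sample, model, history = progressive_uncertainty_sampling(
    pool_x, pool_y, val_x, val_y)
```

Here `history` records the sample size and validation accuracy at each iteration, so the growth schedule and the stopping point can be inspected. Stopping on the first improvement smaller than `tol` is a deliberately simple convergence test; a real schedule would likely need a more robust criterion.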

Index Terms—Active learning, uncertainty sampling, progressive sampling, linear regression with local sampling, random sampling, sampling, machine learning.

The authors are with George Mason University, Fairfax, VA 22030, USA (e-mail: aelrafey@gmu.edu, jwojtusi@gmu.edu).


Cite: Amr ElRafey and Janusz Wojtusiak, "A Hybrid Active Learning and Progressive Sampling Algorithm," International Journal of Machine Learning and Computing, vol. 8, no. 5, pp. 423-427, 2018.
