CRISP-DM: A Data Science Project Methodology

Authors

  • Afika Rianti Universitas Pendidikan Indonesia
  • Nuur Wachid Abdul Majid Universitas Pendidikan Indonesia
  • Ahmad Fauzi Universitas Pendidikan Indonesia

Keywords:

Literature review, data science, CRISP-DM

Abstract

This research is motivated by the growing popularity of data science. To support this growth, especially in research, methodologies are needed so that projects are well structured and organized. This study discusses one of the most widely used data science methodologies, the Cross-Industry Standard Process for Data Mining (CRISP-DM). The study is a literature review, with data collected through a literature study. The review concludes that the CRISP-DM methodology can be applied to data science projects, specifically in data mining, artificial intelligence, machine learning, deep learning, big data, data analysis, and data analytics. The methodology has six stages, each of which consists of its own phases. Researchers can follow the phases commonly used in other studies or adapt them to their own projects. Finally, the decision to use CRISP-DM can be weighed against its benefits and challenges.

Published

2023-07-25