Shedding Light on Dataset Influence for More Transparent Machine Learning
Author(s): Venkata Surendra Kumar Settibathini (Intellect Business Solutions, USA), Ankit Virmani (Google Inc., USA), Manoj Kuppam (Independent Researcher, USA), Nithya S. (Dhaanish Ahmed College of Engineering, India), S. Manikandan (Dhaanish Ahmed College of Engineering, India) and Elayaraja C. (Dhaanish Ahmed College of Engineering, India)
Copyright: 2024
Pages: 16
Source title: Explainable AI Applications for Human Behavior Analysis
Source Author(s)/Editor(s): P. Paramasivan (Dhaanish Ahmed College of Engineering, India), S. Suman Rajest (Dhaanish Ahmed College of Engineering, India), Karthikeyan Chinnusamy (Veritas, USA), R. Regin (SRM Institute of Science and Technology, India) and Ferdin Joe John Joseph (Thai-Nichi Institute of Technology, Thailand)
DOI: 10.4018/979-8-3693-1355-8.ch003
Abstract
Machine learning models are now essential in fields ranging from healthcare to banking. However, their decision-making processes can be opaque, which poses a challenge for those who rely on their outputs. The quality and nature of the training and evaluation datasets shape these models' transparency and performance. This study examines how dataset factors, specifically data quality, bias, and volume, affect model performance and interpretability across a variety of datasets. The authors find that dataset selection and treatment are crucial to transparent and accurate machine learning results. The accuracy, completeness, and relevance of the data affect a model's ability to learn and predict. Dataset biases, whether introduced by sampling practices or by historical prejudices in data gathering, can skew model predictions and lead to unfair or unethical outcomes. Dataset size also matters, according to the authors' findings: larger datasets offer greater learning opportunities but might cause processing issues and overfitting, while smaller datasets may not capture real-world diversity, resulting in underfitting and poor generalisation. The chapter offers practitioners guidance on pre-processing data to reduce bias, assuring data quality, and determining appropriate dataset sizes. Addressing these dataset-induced issues can improve the transparency and effectiveness of machine learning models, making them robust, ethical tools for many applications.