Data Science in the Database: Using SQL for Data Preparation
Abstract
An important part of the data lifecycle is preparing the data for analysis. This includes exploratory data analysis to find problems, data cleaning to fix problems caused by 'dirty' data, and data transformation to put the data into the format required by data analytics tools. Much data resides in databases, so it is beneficial for the data scientist to carry out as much of this work as possible inside the database, which can be achieved using SQL. This article gives an overview of some basic approaches to data exploration, cleaning, and wrangling using SQL, including methods for dealing with missing data, outliers, and bad values.
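As a rough illustration of the kind of in-database preparation the abstract describes, the sketch below runs a few SQL statements against an in-memory SQLite database from Python. It is not taken from the article: the `readings` table, the `temp_c` column, the `-999` sentinel, and the plausible range of -50 to 60 are all assumptions made up for this example, and mean imputation is just one of several possible ways to fill missing values.

```python
import sqlite3

# Hypothetical table and values, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, temp_c REAL)")
conn.executemany(
    "INSERT INTO readings (id, temp_c) VALUES (?, ?)",
    [(1, 20.0), (2, 22.0), (3, None), (4, 21.0), (5, -999.0), (6, 500.0)],
)

# Exploration: COUNT(*) counts every row, COUNT(col) skips NULLs,
# so the difference is the number of missing values.
total, missing = conn.execute(
    "SELECT COUNT(*), COUNT(*) - COUNT(temp_c) FROM readings"
).fetchone()

# Bad values: assume -999 is a sentinel for "no reading" and turn it into NULL.
conn.execute("UPDATE readings SET temp_c = NULL WHERE temp_c = -999.0")

# Outliers: list rows outside an assumed plausible range.
# NULL rows are not returned because NULL NOT BETWEEN ... evaluates to NULL.
outliers = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM readings WHERE temp_c NOT BETWEEN -50 AND 60 ORDER BY id"
    )
]

# Missing data: replace NULLs and out-of-range values with the mean of the
# in-range values (a simple imputation choice, assumed for this sketch).
cleaned = conn.execute(
    """
    SELECT id,
           COALESCE(
               CASE WHEN temp_c BETWEEN -50 AND 60 THEN temp_c END,
               (SELECT AVG(temp_c) FROM readings
                 WHERE temp_c BETWEEN -50 AND 60)
           ) AS temp_c_clean
      FROM readings
     ORDER BY id
    """
).fetchall()

print(total, missing, outliers)
print(cleaned)
```

The same statements could be issued directly in a database client; Python is used here only to make the sketch self-contained.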