This project is a comprehensive web application built with R Shiny. It streamlines the data workflow by integrating data upload, cleaning & preprocessing, feature engineering, exploratory data analysis (EDA), and downloading functions into one interactive platform. Users can either upload local files or use built-in R datasets.
- Programming Language: R
- URL: Data Explorer & Comparison Tool
- Core Shiny App: data_explorer_app_final.R
- Clone or download this repository to your local machine.
- Install the required packages listed under Required R Packages.
- Open the data_explorer_app_final.R file in RStudio (or any R environment).
- Click Run App. A Shiny app will open in your browser.
- Use the tabs to upload, clean/preprocess, engineer features, explore, and download your processed data.
- Select Reset App when you need to start fresh.
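If you prefer the R console to the Run App button, the launch step can be sketched as follows. This is a minimal sketch that assumes the working directory is the repository root and that data_explorer_app_final.R ends with an expression returning a Shiny app object; `launch_app` is a hypothetical helper, not something the repository provides.

```r
# Hypothetical helper: launch the app from the R console.
launch_app <- function(path = "data_explorer_app_final.R") {
  # Fail early with a clear message if the script is not where we expect it.
  if (!file.exists(path)) {
    stop("App file not found: ", path, call. = FALSE)
  }
  # shiny::runApp() accepts a path to a single .R file that ends in a Shiny app object.
  shiny::runApp(path, launch.browser = TRUE)
}

# Usage (from the repository root):
# launch_app()
```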
User Guide
- Introductory instructions on navigating each tab in the Shiny app.
Data Upload
- Upload files (.csv, .xlsx, .json, .rds) or select built-in datasets (e.g., mtcars).
- Preview loaded datasets.
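To illustrate how an upload step like this can pick a reader from the file extension, here is a minimal sketch. The function names `load_dataset` and `load_builtin` are hypothetical, and the app's own code may differ; base `read.csv` is used for CSV here to keep dependencies low, while the .xlsx and .json branches assume readxl and jsonlite are installed.

```r
# Hypothetical loader: choose a reader from the file extension.
load_dataset <- function(path) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
    csv  = utils::read.csv(path, stringsAsFactors = FALSE),
    xlsx = readxl::read_excel(path),
    json = jsonlite::fromJSON(path),
    rds  = readRDS(path),
    stop("Unsupported file type: .", ext, call. = FALSE)
  )
}

# Hypothetical helper: fetch a built-in dataset such as "mtcars" by name.
load_builtin <- function(name) {
  get(name, envir = as.environment("package:datasets"))
}

# Usage:
# head(load_builtin("mtcars"))
# df <- load_dataset("my_data.csv")
```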
Data Cleaning & Preprocessing
- Remove duplicates, handle missing values (remove or impute), and detect outliers (Z-score/IQR).
- Normalize/standardize numeric columns, encode categorical variables, and convert numeric-like columns (numbers stored as text) to numeric.
- Optionally remove selected columns and round numeric values.
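The cleaning operations above can be sketched in base R as follows. These helpers are hypothetical illustrations of the named techniques (duplicate removal, imputation, IQR and Z-score outlier detection, standardization), not the app's actual code; median imputation is just one of several possible imputation choices.

```r
# Drop exact duplicate rows.
remove_duplicates <- function(df) df[!duplicated(df), , drop = FALSE]

# Replace NA in numeric columns with the column median (one possible imputation).
impute_median <- function(df) {
  num <- vapply(df, is.numeric, logical(1))
  df[num] <- lapply(df[num], function(x) {
    x[is.na(x)] <- median(x, na.rm = TRUE)
    x
  })
  df
}

# Flag outliers with the IQR rule: values outside [Q1 - k*IQR, Q3 + k*IQR].
iqr_outliers <- function(x, k = 1.5) {
  q <- unname(quantile(x, c(0.25, 0.75), na.rm = TRUE))
  spread <- q[2] - q[1]
  x < q[1] - k * spread | x > q[2] + k * spread
}

# Flag outliers with the Z-score rule: |z| above a threshold.
zscore_outliers <- function(x, threshold = 3) {
  abs((x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)) > threshold
}

# Standardize to mean 0 and standard deviation 1.
standardize <- function(x) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
```

The IQR rule is usually the safer default on small or skewed samples, because a single extreme value inflates the standard deviation that the Z-score rule depends on.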
Feature Engineering
- Apply transformations (log, square root, square, Box-Cox, power).
- Create additional features (time-based, text-based, statistical summaries).
- Conduct feature selection (variance threshold, correlation, Lasso) and dimensionality reduction (PCA, LDA).
- Remove or keep specific columns post-transformation.
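A few of the feature-engineering steps above can be sketched in base R. The helper names are hypothetical and the app may implement these steps differently (for example through recipes or caret); the sketch assumes the input columns are numeric and free of missing values where noted.

```r
# Log transform; log1p(x) = log(1 + x), so zeros are allowed (x must be > -1).
log_transform <- function(x) log1p(x)

# Variance threshold: keep only numeric columns whose variance exceeds `min_var`.
variance_filter <- function(df, min_var = 0) {
  num <- vapply(df, is.numeric, logical(1))
  keep <- vapply(df[num], function(x) var(x, na.rm = TRUE) > min_var, logical(1))
  df[, names(keep)[keep], drop = FALSE]
}

# Correlation filter: drop any column with |r| > cutoff against an earlier column.
correlation_filter <- function(df, cutoff = 0.9) {
  r <- abs(cor(df, use = "pairwise.complete.obs"))
  r[lower.tri(r, diag = TRUE)] <- 0  # keep only pairs (i, j) with i < j
  drop <- apply(r, 2, function(col) any(col > cutoff))
  df[, !drop, drop = FALSE]
}

# PCA on centered, scaled numeric columns (no NAs); returns the first `n` score columns.
pca_scores <- function(df, n = 2) {
  fit <- prcomp(df, center = TRUE, scale. = TRUE)
  as.data.frame(fit$x[, seq_len(n), drop = FALSE])
}
```

Running the variance filter before PCA matters here: `prcomp` with `scale. = TRUE` cannot rescale a constant column.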
Exploratory Data Analysis (EDA)
- Generate histograms, bar charts, boxplots, scatter plots, and correlation heatmaps.
- Access interactive plots (via plotly) and a statistical summary of numeric columns.
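As a rough illustration of the EDA outputs, the sketch below builds a numeric summary table in base R and an interactive histogram with ggplot2 and plotly. Both function names are hypothetical, and the statistics chosen (mean, sd, min, max) are an assumption about what the summary contains.

```r
# Hypothetical helper: summary statistics for every numeric column.
numeric_summary <- function(df) {
  num <- df[vapply(df, is.numeric, logical(1))]
  stat <- function(f) {
    unname(vapply(num, function(x) as.numeric(f(x, na.rm = TRUE)), numeric(1)))
  }
  data.frame(
    variable = names(num),
    mean     = stat(mean),
    sd       = stat(sd),
    min      = stat(min),
    max      = stat(max),
    stringsAsFactors = FALSE
  )
}

# Hypothetical helper: interactive histogram of one column (needs ggplot2 and plotly).
interactive_histogram <- function(df, column, bins = 30) {
  p <- ggplot2::ggplot(df, ggplot2::aes(x = .data[[column]])) +
    ggplot2::geom_histogram(bins = bins)
  plotly::ggplotly(p)  # convert the static ggplot into an interactive plotly widget
}

# Usage:
# numeric_summary(iris)
# interactive_histogram(mtcars, "mpg")
```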
Download & Reset
- Download the processed dataset as .xlsx, .csv, .json, or .rds.
- Reset the application to clear all uploaded and processed data.
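The export step can be sketched as a writer chosen by format, mirroring the four download types above. `save_dataset` is a hypothetical helper; the .xlsx and .json branches assume openxlsx and jsonlite are installed.

```r
# Hypothetical exporter: choose a writer from the requested format.
save_dataset <- function(df, path, format = tolower(tools::file_ext(path))) {
  switch(format,
    csv  = utils::write.csv(df, path, row.names = FALSE),
    xlsx = openxlsx::write.xlsx(df, path),
    json = jsonlite::write_json(df, path),
    rds  = saveRDS(df, path),
    stop("Unsupported format: ", format, call. = FALSE)
  )
  invisible(path)  # return the path so calls can be chained
}

# Usage:
# save_dataset(mtcars, "processed.csv")
```

In a Shiny app, a function like this would typically be called from the `content` function passed to `shiny::downloadHandler()`, which supplies the temporary file path to write to.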
Required R Packages (install them before running the app):
- shiny
- shinyjs
- shinythemes
- DT
- dplyr
- openxlsx
- stringr
- readr
- readxl
- jsonlite
- ggplot2
- plotly
- lubridate
- ggcorrplot
- recipes
- caret
- reshape2
- MASS
- rmarkdown
- DBI
- shinyWidgets
(For KNN imputation, install DMwR; for Lasso selection, install glmnet.)
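One way to install only what is missing is sketched below. The `required` vector copies the package list above; `missing_packages` is a hypothetical helper, and the optional DMwR and glmnet packages are left out on purpose.

```r
# Packages named in the list above (optional extras DMwR and glmnet not included).
required <- c("shiny", "shinyjs", "shinythemes", "DT", "dplyr", "openxlsx",
              "stringr", "readr", "readxl", "jsonlite", "ggplot2", "plotly",
              "lubridate", "ggcorrplot", "recipes", "caret", "reshape2",
              "MASS", "rmarkdown", "DBI", "shinyWidgets")

# Hypothetical helper: return the packages that are not installed yet.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Usage:
# to_install <- missing_packages(required)
# if (length(to_install) > 0) install.packages(to_install)
```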
- Wenbo Liu (Wenbo0528)
- Fatih Euysal (fatihEuysal)
- Xiaoying Wang (XiaoyingWang412)
- Julieta Caroppo (julieta87)
This project is for educational purposes and is licensed under the MIT License.