BalanceData.Rd
Clean text and build a term matrix for a bag-of-words or TF-IDF model.
BalanceData(dataset)
dataset | Unbalanced dataset: a data frame with two columns, the first containing review texts and the second a binary class label (negative = 0, positive = 1). |
---|
balanced_dataframe: a balanced data frame containing two columns, review texts and a binary class label (negative = 0, positive = 1).
A balanced data frame.
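# NOT RUN {
library("SentiAnalyzer")
# Locate the imbalanced example file shipped with the package
direction <- system.file(package = "SentiAnalyzer", "extdata/Imbalance_Restaurant_Reviews.tsv")
# Read the tab-separated reviews; quoting is disabled so quotes inside reviews are kept
imbalance_data <- read.delim(direction, quote = '', stringsAsFactors = FALSE)
BalanceData(imbalance_data)
# }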
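
This page does not say how BalanceData equalizes the two classes. To illustrate what a balanced two-column result looks like, here is a minimal base-R sketch of random oversampling, which is one common approach and not necessarily the one the package uses. The toy data frame `reviews`, its column names `text` and `label`, and the seed are all hypothetical.

```r
# Hypothetical toy input in the documented shape: text first, 0/1 label second.
reviews <- data.frame(
  text  = c("great food", "loved it", "nice staff", "tasty",
            "too salty", "slow service"),
  label = c(1, 1, 1, 1, 0, 0),
  stringsAsFactors = FALSE
)

set.seed(42)  # only so the resampled rows are reproducible

# Size of the largest class; every class is padded up to this size.
target <- max(table(reviews$label))

# Random oversampling (an assumed technique, not confirmed as BalanceData's method):
# keep every original row and add rows sampled with replacement from the same class.
balanced <- do.call(rbind, lapply(split(reviews, reviews$label), function(grp) {
  extra <- target - nrow(grp)
  idx <- c(seq_len(nrow(grp)),
           sample(seq_len(nrow(grp)), extra, replace = TRUE))
  grp[idx, , drop = FALSE]
}))
rownames(balanced) <- NULL

# Both classes should now have the same number of rows.
table(balanced$label)
```

Running `table()` on the second column of the data frame returned by BalanceData is a similar quick way to check that the two classes have comparable counts.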