Useful R Packages That Align with the CRISP-DM Methodology

CRISP-DM, which stands for Cross-Industry Standard Process for Data Mining, is a process model that outlines the most common approach to tackling data-driven problems. According to a poll conducted by KDnuggets in 2014, it was, and still is, one of the most popular and widely used methodologies. This method of gleaning …

Simple Guide for Selecting Statistical Tests When Comparing Groups

Selecting the right statistical test can be a daunting task for anyone. This infographic presents a step-by-step approach to test selection. Walking through the various conditions to pick the appropriate test lets the audience visualize and remember the process easily. However, it is also very …

The Most Common Analytical and Statistical Mistakes

It is not only about understanding statistics; it is also about implementing the correct statistical approach or method. In this brief article I will showcase some common statistical blunders that we generally make and how to avoid them. To make this information simple and consumable, I have divided these errors into two parts: Data …

How to create a best-fitting regression model?

The best subset regression method can be used to create a best-fitting regression model. This model-building technique helps identify which predictor (independent) variables should be included in a multiple linear regression (MLR) model. The method consists of examining all of the models created from every possible combination of predictor variables. This technique uses the …
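The idea of fitting every combination of predictors and keeping the best one can be sketched in a few lines. The excerpt is cut off before it names the selection criterion, so the sketch below assumes adjusted R² as the score; the toy data, the variable names (x1, x2, x3) and the helper functions are all hypothetical and written in plain Python for illustration only.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back-substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def adjusted_r2(rows, y, cols):
    """Fit ordinary least squares (with intercept) on the chosen columns; return adjusted R^2."""
    X = [[1.0] + [row[c] for c in cols] for row in rows]
    p = len(cols) + 1  # number of coefficients, intercept included
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    n = len(y)
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - (sse / (n - p)) / (sst / (n - 1))


def best_subset(rows, y, names):
    """Score every non-empty subset of predictors and return (best score, predictor names)."""
    best = None
    for k in range(1, len(names) + 1):
        for cols in combinations(range(len(names)), k):
            score = adjusted_r2(rows, y, cols)
            if best is None or score > best[0]:
                best = (score, [names[c] for c in cols])
    return best


# Hypothetical toy data: y is built from x1 and x2 plus small noise; x3 is unrelated.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
x3 = [4, 4, 1, 1, 3, 3, 2, 2]
noise = [0.1, -0.2, 0.15, -0.05, 0.2, -0.1, -0.15, 0.05]
y = [2 * a + 3 * b + e for a, b, e in zip(x1, x2, noise)]
rows = list(zip(x1, x2, x3))

score, subset = best_subset(rows, y, ["x1", "x2", "x3"])
print(subset, round(score, 4))
```

With k candidate predictors the search fits 2^k - 1 models, which is why the approach is practical only for a modest number of variables.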

Tools and Application for Business Excellence: Logistic Regression

Overview: Statistical techniques such as regression and analysis of variance (ANOVA) are useful when the response variable (Y) is continuous. However, if Y, also known as the Key Performance Output Variable (KPOV), is discrete, then these methods end up being redundant or futile. If the response variable is binary (discrete) and the input variable(s) is/are continuous, then …

Using Naïve Bayes with Speech Analytics Output to predict First Contact Resolution

What is Naive Bayes? It is a classification method based on Bayes' theorem that assumes independence among the predictors. In simple terms, a Naive Bayes classifier assumes that, given the class, the presence of one feature is unrelated to the presence of any other feature in the data. Being easy to use and understand, it can be used with large …
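To make the independence assumption concrete, here is a minimal sketch of a categorical Naive Bayes classifier in plain Python. It is not the method or data from the post: the two features (a sentiment tag and a repeat-issue flag), the "resolved"/"unresolved" labels, the tiny training set and the add-one (Laplace) smoothing are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # feature_counts[label][feature_index][value] -> number of occurrences
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    values = defaultdict(set)  # distinct values seen for each feature
    for row, label in zip(rows, labels):
        for i, v in enumerate(row):
            feature_counts[label][i][v] += 1
            values[i].add(v)
    return class_counts, feature_counts, values


def predict(model, row):
    """Return the class with the highest log posterior score."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        log_p = math.log(n / total)  # log prior
        for i, v in enumerate(row):
            count = feature_counts[label][i][v]
            # Independence assumption: per-feature likelihoods are simply multiplied
            # (added in log space). Add-one smoothing avoids zero probabilities.
            log_p += math.log((count + 1) / (n + len(values[i])))
        scores[label] = log_p
    return max(scores, key=scores.get)


# Hypothetical training data: (sentiment, issue_repeated) -> outcome
rows = [
    ("positive", "no"),
    ("positive", "no"),
    ("neutral", "no"),
    ("negative", "yes"),
    ("negative", "yes"),
    ("neutral", "yes"),
]
labels = ["resolved", "resolved", "resolved",
          "unresolved", "unresolved", "unresolved"]

model = train(rows, labels)
first_call = predict(model, ("positive", "no"))
second_call = predict(model, ("negative", "yes"))
print(first_call, second_call)
```

Because each feature contributes its own term to the score, training only needs simple counts, which is what keeps the method cheap on large data sets.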