Who was the lead character in Friends? The Data Science Answer

It has been more than 13 years since the last episode of Friends aired. But we never stop talking about it. Do we? I do not remember the last time I had a pizza without watching a random episode of Friends. Last night, I was watching one of my favorite episodes, "The One With Ross' … Continue reading Who was the lead character in Friends? The Data Science Answer

How to One Hot Encode Categorical Variables of a Large Dataset in Python?

In this post, I will discuss a very common problem that we face when dealing with a machine learning task - How to handle categorical data especially when the entire dataset is too large to fit in memory? I will talk about how to represent categorical variables, the common problems we face while one hot … Continue reading How to One Hot Encode Categorical Variables of a Large Dataset in Python?

Bootstrapping – A Powerful Resampling Method in Statistics

We are often interested in population parameters. For example, the mean salary of all adults in a country. But collecting data of the entire population is almost always infeasible. Therefore, we use samples of the population to get a point estimate of our parameter of interest. But, what is the 95% confidence interval of your … Continue reading Bootstrapping – A Powerful Resampling Method in Statistics