Transfer learning in natural language processing is an area that had not been explored with great success. But last month (May 2018), Jeremy Howard and Sebastian Ruder came out with the paper Universal Language Model Fine-tuning for Text Classification, which explores the benefits of using a pre-trained model for text classification. It proposes ULMFiT, a transfer … Continue reading Understanding the Working of Universal Language Model Fine Tuning (ULMFiT)
Pivoting a table is a very common operation in data processing, but there is no direct function in BigQuery to perform such an operation. To solve this problem, I have written a Python module, BqPivot. It generates a SQL query to pivot a table, which can then be run in BigQuery. In this blog post, I will … Continue reading How to pivot large tables in BigQuery?
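To make the idea of generating a pivot query concrete, here is a minimal sketch in Python. It is not BqPivot's actual code or API: the function name `build_pivot_query`, its parameters, and the table and column names are all hypothetical, and it assumes the pivot values are simple, trusted strings with no quotes in them. It relies on conditional aggregation, with one `IF` inside an aggregate per pivoted value.

```python
def build_pivot_query(table, index_col, pivot_col, value_col, pivot_values, agg="SUM"):
    """Return a SQL string that pivots `table` with conditional aggregation.

    Hypothetical helper for illustration only; not the BqPivot interface.
    Assumes `pivot_values` are trusted strings that contain no quotes.
    """
    columns = []
    for value in pivot_values:
        # Build a safe column alias: keep letters and digits, replace the rest.
        alias = "".join(ch if ch.isalnum() else "_" for ch in str(value))
        columns.append(
            f"{agg}(IF({pivot_col} = '{value}', {value_col}, NULL)) "
            f"AS {pivot_col}_{alias}"
        )
    select_list = ",\n  ".join([index_col] + columns)
    return f"SELECT\n  {select_list}\nFROM `{table}`\nGROUP BY {index_col}"


if __name__ == "__main__":
    # Made-up table and column names, purely for demonstration.
    print(build_pivot_query(
        table="my_dataset.sales",
        index_col="customer_id",
        pivot_col="category",
        value_col="amount",
        pivot_values=["books", "toys"],
    ))
```

Each distinct value of the pivot column becomes its own output column, so the list of values has to be known (or queried) before the SQL is generated.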
I studied Java in high school. When I first started writing Python in my freshman year, I used to mentally translate Java to Python. But after a good amount of open source exposure, I realized that Python is much cleaner and more idiomatic than Java. In this blog post, I discuss 6 things I wish … Continue reading 6 Things Every Beginner Should Know To Write Clean Python Code
Deep learning algorithms require a huge amount of training data. This tempts us to put more and more labeled data into our training set, even if it does not come from the same distribution as the data we are actually interested in. For example, let's say we are building a cat classifier for door camera devices. We … Continue reading What to do when we have mismatched training and validation set?
Earlier, I was of the opinion that getting computers to recognize images requires a huge amount of data, carefully tuned neural network architectures, and lots of coding. But after taking the fast.ai deep learning course, I found out that this is not always true. We can achieve a lot by writing just a few lines … Continue reading Hotdog or Not Hotdog – Image Classification in Python using fastai
Activation functions are an integral component of neural networks. There are a number of common activation functions, which often makes it confusing to decide which one is best suited for a particular task. In this blog post, I will talk about: Why do we need activation functions in neural networks? Output layer activation … Continue reading Which activation function to use in neural networks?
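For reference, a few of the most common activation functions can be written in a handful of lines. This is a plain-Python sketch using only the standard library, meant to show the formulas rather than to be fast; the function names are my own choices and not taken from any particular framework.

```python
import math


def sigmoid(x):
    """Squash a real number into (0, 1); written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relu(x):
    """Pass positive inputs through unchanged and clamp negatives to zero."""
    return max(0.0, x)


def softmax(values):
    """Turn a list of scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing without changing the result.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(sigmoid(0.0))
    print(relu(-2.0), relu(3.5))
    print(softmax([1.0, 2.0, 3.0]))
```

Sigmoid and softmax produce outputs that can be read as probabilities, while ReLU simply zeroes out negative inputs.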
As promised, this is the second post in my two-part blog series on time series modelling and forecasting. In my first blog post, I discussed the basics of time series analysis and gave a theoretical overview. In case you missed it, you can find it here: Understanding Time Series Modelling and Forecasting, Part 1 … Continue reading Understanding Time Series Modelling and Forecasting, Part 2
Time series forecasting is extensively used in numerous practical fields such as business, economics, finance, science and engineering. The main aim of a time series analysis is to forecast future values of a variable using its past values. In this post, I will give you a detailed introduction to time series modelling. This would be the … Continue reading Understanding Time Series Modelling and Forecasting – Part 1
It has been more than 13 years since the last episode of Friends aired. But we never stop talking about it. Do we? I do not remember the last time I had a pizza without watching a random episode of Friends. Last night, I was watching one of my favorite episodes, "The One With Ross' … Continue reading Who was the lead character in Friends? The Data Science Answer
In this post, I will discuss a very common problem that we face when dealing with a machine learning task: how to handle categorical data, especially when the entire dataset is too large to fit in memory? I will talk about how to represent categorical variables, the common problems we face while one hot … Continue reading How to One Hot Encode Categorical Variables of a Large Dataset in Python?
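One way to picture the large-dataset case is a two-pass approach over chunks of rows: a first pass collects the distinct categories, and a second pass encodes each chunk against that fixed list. The sketch below is only an illustration under that assumption, not necessarily the method used in the post. It uses plain dictionaries so it runs with the standard library alone, and the helper names `collect_categories` and `one_hot_chunks` are hypothetical.

```python
def collect_categories(chunks, column):
    """First pass: gather the sorted list of distinct values seen in `column`."""
    seen = set()
    for chunk in chunks:
        for row in chunk:
            seen.add(row[column])
    return sorted(seen)


def one_hot_chunks(chunks, column, categories):
    """Second pass: yield each chunk with `column` replaced by 0/1 indicator fields."""
    for chunk in chunks:
        encoded = []
        for row in chunk:
            # Copy every field except the categorical one being encoded.
            new_row = {key: val for key, val in row.items() if key != column}
            for category in categories:
                new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
            encoded.append(new_row)
        yield encoded


if __name__ == "__main__":
    # Two tiny in-memory "chunks" standing in for pieces of a much larger file.
    chunks = [
        [{"id": 1, "color": "red"}, {"id": 2, "color": "blue"}],
        [{"id": 3, "color": "green"}],
    ]
    categories = collect_categories(chunks, "color")
    for encoded_chunk in one_hot_chunks(chunks, "color", categories):
        print(encoded_chunk)
```

Fixing the category list before encoding means every chunk ends up with the same columns in the same order, even when a given chunk does not contain every category.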