Recently, while implementing EfficientNet networks, I came across a GitHub comment detailing an implementation of the Swish activation function that promises to reduce GPU memory usage for EfficientNets by up to 30%. In this (short) blog post I will briefly go over the details of this implementation and explain what enables it to save GPU memory.
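As a rough illustration of the kind of trick involved, here is a framework-free NumPy sketch. It assumes the saving comes from storing only the input during the forward pass and recomputing the sigmoid in the backward pass, rather than keeping intermediate results around. The class name `RecomputeSwish` and its `forward`/`backward` methods are hypothetical names chosen for this sketch; they are not the code from the GitHub comment.

```python
import numpy as np


def sigmoid(x):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))


class RecomputeSwish:
    """Hypothetical sketch: Swish that keeps only its input for the backward pass."""

    def forward(self, x):
        # Assumption: the input is the only array stored for backward.
        # The sigmoid output is deliberately not kept.
        self.saved_x = x
        return x * sigmoid(x)

    def backward(self, grad_output):
        x = self.saved_x
        s = sigmoid(x)  # recomputed here instead of being stored in forward
        # d/dx [x * s(x)] = s + x * s * (1 - s) = s * (1 + x * (1 - s))
        return grad_output * (s * (1.0 + x * (1.0 - s)))


x = np.linspace(-3.0, 3.0, 7)
act = RecomputeSwish()
y = act.forward(x)
grad = act.backward(np.ones_like(x))
```

The trade-off in this sketch is a little extra compute in the backward pass in exchange for holding fewer arrays in memory between the two passes.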
Continue reading
In the previous post, I went through some areas where a Gaussian distribution could be useful. This post focuses on implementing Gaussians. Specifically, we will implement our first Gaussian, its discrete integral approximation, and several metrics that can be used to compare two distributions. I will be using Python’s NumPy library for all numerical operations in this post.
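To give a flavour of what this looks like, here is a minimal NumPy sketch. The function names (`gaussian_pdf`, `discrete_integral`, `kl_divergence`), the grid, and the choice of KL divergence as the comparison metric are all assumptions made for illustration; the full post may use different names and metrics.

```python
import numpy as np


def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of a univariate Gaussian with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def discrete_integral(y, dx):
    """Riemann-sum approximation of the integral of y sampled on a uniform grid."""
    return np.sum(y) * dx


def kl_divergence(p, q, dx):
    """Discrete approximation of KL(p || q) for densities sampled on the same grid."""
    return np.sum(p * np.log(p / q)) * dx


# Uniform grid wide enough that both densities are negligible at the edges.
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]

p = gaussian_pdf(x, mu=0.0, sigma=1.0)
q = gaussian_pdf(x, mu=1.0, sigma=1.5)

area_p = discrete_integral(p, dx)  # should be close to 1
kl_pq = kl_divergence(p, q, dx)    # asymmetric: KL(p || q) != KL(q || p) in general
```

A simple Riemann sum is enough here because both densities are smooth and essentially zero at the grid boundaries; a narrower grid or a much smaller `sigma` would call for a finer step.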
Continue reading
In my previous post, I introduced Gaussians and their existence in different datasets. This post is primarily aimed at a derivation that I think is essential for understanding where Gaussians are used in machine learning. This proof can be found in any machine learning textbook. Most of the derivation can also be found in this Stats Stack Exchange post, which helped me along the way.
Continue reading