Similarity-based image search, also known as content-based image retrieval, has historically been a challenging computer vision task. The problem is especially difficult for visual art, because it is less obvious how a metric of “similarity” should be defined and who should set that standard. For example, when I upload a photo of a wall mural … [Read more...] about Similarity-Based Image Search for Visual Art
CV Tutorials
Building A Face Recognition System Using Scikit Learn In Python
What’s face recognition? Face recognition is the task of comparing an unknown individual’s face to images in a database of stored records. The mapping could be one-to-one or one-to-many, depending on whether we are running face verification or face identification. In this tutorial, we are interested in building a facial identification system that will verify if … [Read more...] about Building A Face Recognition System Using Scikit Learn In Python
Self-Supervised Learning In Vision Transformers
Anyone who has ever approached the world of machine learning has certainly heard of supervised learning and unsupervised learning. These are two important approaches to machine learning that have been widely used for years. Only recently, however, has a new term exploded in popularity: Self-Supervised Learning! But let’s get there step by step and look at … [Read more...] about Self-Supervised Learning In Vision Transformers
How To Automate 3D Point Cloud Segmentation And Clustering With Python
If you have worked with point clouds in the past (or, for that matter, with any data), you know how important it is to find patterns among your observations 📈. Indeed, we often need to extract higher-level knowledge that relies heavily on determining “objects” formed by data points that share a pattern. This is a task that is accomplished quite comfortably by our visual … [Read more...] about How To Automate 3D Point Cloud Segmentation And Clustering With Python
Vision Transformers or Convolutional Neural Networks? Both!
The field of Computer Vision has for years been dominated by Convolutional Neural Networks (CNNs). Through the use of filters, these networks are able to generate simplified versions of the input image by creating feature maps that highlight the most relevant parts. These features are then used by a multi-layer perceptron to perform the desired classification. But recently … [Read more...] about Vision Transformers or Convolutional Neural Networks? Both!
On DINO, Self-Distillation With No Labels
It has been clear for some time that Transformers had arrived in the field of computer vision to amaze, but hardly anyone could have imagined such astonishing results from a Vision Transformer so soon after their first application. In this article, we discuss one of the most interesting advances in the field of computer vision, DINO, announced a few days … [Read more...] about On DINO, Self-Distillation With No Labels
Transformers in Computer Vision
The Transformer architecture has achieved state-of-the-art results in many NLP (Natural Language Processing) tasks. Arguably one of the main breakthroughs with the Transformer model is the powerful GPT-3, released in the middle of the year, whose paper received a Best Paper award at NeurIPS 2020. In Computer Vision, CNNs have become the dominant models for vision tasks … [Read more...] about Transformers in Computer Vision
Deep Transfer Learning for Image Classification
The following tutorial covers how to set up a state-of-the-art deep learning model for image classification. The approach is based on the machine learning frameworks TensorFlow and Keras, and includes all the code needed to replicate the results in this tutorial. The prerequisite for setting up the model is access to labelled data, and as an example case I have used … [Read more...] about Deep Transfer Learning for Image Classification
Convolutional Neural Networks With Heterogeneous Metadata
In autonomous driving, convolutional neural networks are the go-to tool for various perception tasks. Although CNNs are great at distilling information from camera images (or a sequence of them in the form of a video clip), I constantly bump into all kinds of metadata that do not lend themselves to convolutional neural networks. Metadata, by traditional definition, means a set … [Read more...] about Convolutional Neural Networks With Heterogeneous Metadata
Generating New Faces With Variational Autoencoders
Introduction Deep generative models are gaining tremendous popularity in both industry and academic research. The idea of a computer program generating new human faces or new animals can be quite exciting. Deep generative models take a slightly different approach compared to supervised learning, which we shall discuss very soon. This tutorial covers the basics … [Read more...] about Generating New Faces With Variational Autoencoders