One of the most common problems in computer vision is a shortage of images for training ML models. Deep learning needs a large amount of data for neural networks to learn the relevant characteristics of their inputs and then perform inference correctly; models trained on limited samples fail to generalise to unseen data. Even when pre-trained models (transfer learning) are used, the images available for a particular use case are often insufficient and the model is not trained correctly.

At Keepler we have faced this challenge in projects involving object detection in images, and more specifically in anomaly detection projects. This led us to look for methods of generating synthetic images (data augmentation) so that projects with a small image dataset become viable. Specifically, we have researched two techniques:

  • Generation of images using classical data augmentation procedures: applying distortions, rotations, colour changes, etc. to the original images.
  • Generation of images with GANs (Generative Adversarial Networks), specifically using CycleGANs to apply a context change (style transfer) to the original images and generate new ones.
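
As a rough illustration of the first technique, the sketch below produces augmented copies of an image using only NumPy. It is not the pipeline described in the white paper: the function name `augment`, the choice of transforms (rotations in 90-degree steps, horizontal mirroring, per-channel colour scaling) and the 0.8 to 1.2 gain range are all assumptions made for this example, and a random array stands in for a real photo.

```python
import numpy as np


def augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Return one randomly augmented copy of an H x W x 3 uint8 image.

    Hypothetical helper for illustration; the transforms and ranges
    below are assumptions, not values taken from the white paper.
    """
    out = image.copy()

    # Geometric change: rotate by a random multiple of 90 degrees.
    out = np.rot90(out, k=int(rng.integers(0, 4)), axes=(0, 1))

    # Geometric change: mirror left-to-right about half of the time.
    if rng.random() < 0.5:
        out = out[:, ::-1, :]

    # Colour change: scale each RGB channel by its own random gain,
    # then clip back into the valid 0..255 range.
    gains = rng.uniform(0.8, 1.2, size=3)
    out = np.clip(out.astype(np.float32) * gains, 0, 255)

    return np.ascontiguousarray(out.astype(np.uint8))


# A random array stands in for a real training image.
rng = np.random.default_rng(seed=0)
original = rng.integers(0, 256, size=(64, 48, 3), dtype=np.uint8)
augmented = [augment(original, rng) for _ in range(5)]
```

Each call yields a different variant of the same source image, so a handful of originals can be expanded into a larger training set. For object detection, any bounding boxes would need the same geometric transforms applied to them so that the labels still match the augmented images.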

Generating images, or any other type of data, is common in projects where data is limited. Increasing the variability of the training data helps models generalise better; it can also reduce the cost of data collection and labelling.

The white paper, which you can download, describes in detail the methods we used, some simple and some more complex, to produce the synthetic images needed to train computer vision models.

Download this white paper on data augmentation for free 👇

Title: How to use Data Augmentation when you have limited data
Authors: Ángela García, Data Scientist at Keepler & Adriana A. Bogdan, Data Scientist at Keepler



    • Adriana A. Bogdan

      Data Scientist at Keepler. 'I am a Data Scientist passionate about building models that fix problems and exploring data to draw meaningful conclusions. Being a part of new technologies and trying out innovative solutions is what motivates me the most. I love to learn about life through travelling; this way I can get to know different cultures, lifestyles and cuisines.'

    • Ángela García

      Data Scientist at Keepler. 'Passionate about technology and science, I love solving problems and facing new challenges. In my spare time I like to read, and I am currently studying for a bachelor's degree in humanities and philosophy.'