16. Juli 2019

From Robustness in Deep Learning Vision Models to the Ivory Tower of Language Models

Summary of the PyData Hamburg Spring 2019 Meetup

This is a wrap-up of the PyData Hamburg Spring Edition 2019. Irina Vidal Migallón, Matti Lyra and Christopher Werner presented insights from their latest work in three different fields: deep learning vision, NLP and software test failure prediction.

Stop Sign with Love and Hate Stickers to test a Deep Learning Vision Model

Irina Vidal Migallón

Irina Vidal Migallón is an electrical engineer by training who has specialized in ML and computer vision over the years. She recently joined the AI team at Siemens Mobility. She showed us how to poke holes into a deep learning vision model:

Today the application of deep learning in vision has matured and is widely used across industries. Its use in critical environments such as medical applications or autonomous driving requires robust models. Robustness starts with evaluation and debugging: you make sure that the results look reasonable and that you have optimized your model with respect to the relevant performance indicators.
But when you think you are done, once your precision and recall look good, you might be terribly mistaken! What you actually know is that your model works on the given data set, but it might do so for the wrong reasons. Models are a representation of the world, and you never know for sure from which "frog perspective" a neural net is looking at "reality". When you train a model to differentiate between husky dogs and wolves, and all the wolves are photographed in snowy environments, you have just built a "snow detector". The interpretability of the model and its results is therefore essential: you should be able to explain the model to your stakeholders and to yourself.
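The "snow detector" effect can be reproduced in miniature. The following is an illustrative sketch, not something from the talk: a tiny logistic regression trained on two made-up binary features, where "snow in the background" correlates perfectly with the "wolf" label while the actual animal feature is only 80% reliable. The model ends up leaning on the spurious snow feature.

```python
import math

def train(data, lr=0.5, steps=2000):
    """Plain gradient-descent logistic regression on (animal, snow, label) rows."""
    w_animal, w_snow, b = 0.0, 0.0, 0.0
    for _ in range(steps):
        ga = gs = gb = 0.0
        for animal, snow, label in data:
            p = 1 / (1 + math.exp(-(w_animal * animal + w_snow * snow + b)))
            err = p - label
            ga += err * animal
            gs += err * snow
            gb += err
        n = len(data)
        w_animal -= lr * ga / n
        w_snow -= lr * gs / n
        b -= lr * gb / n
    return w_animal, w_snow, b

# 40 wolves (label 1): always snow, but the animal feature is only 80% reliable.
wolves = [(1, 1, 1)] * 32 + [(0, 1, 1)] * 8
# 40 huskies (label 0): never snow, animal feature again only 80% reliable.
huskies = [(0, 0, 0)] * 32 + [(1, 0, 0)] * 8

w_animal, w_snow, b = train(wolves + huskies)
print(f"animal weight: {w_animal:.2f}, snow weight: {w_snow:.2f}")
# The snow weight dominates: the model has learned a "snow detector".
```

Because snow separates the classes perfectly in this toy training set, gradient descent keeps growing the snow weight while the noisy animal feature stalls — precisely the failure mode interpretability work is meant to catch.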

Adversarial examples are entries in your data set that make your model fail, whether intentionally crafted or not. An example: you augment your data set by changing the lighting conditions or perspectives and find images on which your model no longer works well. The pro tip here is to label the failed examples and feed them back into the training set.
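That augment-and-feed-back loop can be sketched in a few lines. Everything here is hypothetical for illustration — the "images", the lighting augmentation and the stand-in classifier are all made up:

```python
def darken(image, factor=0.5):
    """A simple lighting augmentation: scale every pixel value down."""
    return [pixel * factor for pixel in image]

def toy_model(image):
    """Stand-in classifier: predicts class 1 if the mean pixel value exceeds 0.4."""
    return 1 if sum(image) / len(image) > 0.4 else 0

# Labeled data as (image, true label); all class 1 and correctly classified as-is.
dataset = [([0.9, 0.9, 0.9], 1), ([0.6, 0.5, 0.7], 1), ([0.5, 0.5, 0.5], 1)]

training_set = list(dataset)
hard_examples = []
for image, label in dataset:
    augmented = darken(image)
    if toy_model(augmented) != label:  # the augmentation broke the model
        hard_examples.append((augmented, label))

# The pro tip from the talk: label the failures and feed them back into training.
training_set.extend(hard_examples)
print(f"{len(hard_examples)} hard examples added, training set size: {len(training_set)}")
```

Two of the three darkened images drop below the toy model's brightness threshold and get collected as labeled hard examples for the next training round.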

Intentional adversarial examples are "intrusions" into your AI: The following shows a picture of a panda as well as subtle noise that was generated from the model parameters and the desired output label. The model sees a gibbon where a human would still clearly classify the image as a panda:

a picture of a panda which is interpreted as a picture of a gibbon by an AI model
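The panda-to-gibbon example comes from the fast gradient sign method (FGSM): perturb the input in the direction of the sign of the loss gradient with respect to the input. A minimal sketch on a toy logistic "model" (all weights and inputs are made-up numbers, and the epsilon is exaggerated so the flip is visible at this tiny scale):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Toy "model": logistic regression with fixed, made-up weights.
w = np.array([2.0, -1.5, 0.5])
b = 0.0

x = np.array([0.5, -0.4, 0.3])  # a clean input, confidently class 1
y = 1                           # its true label

p_clean = sigmoid(w @ x + b)

# FGSM: for logistic regression the gradient of the loss w.r.t. the *input*
# is (p - y) * w; step each input component in the direction of its sign.
grad_x = (p_clean - y) * w
x_adv = x + 0.8 * np.sign(grad_x)  # epsilon = 0.8, exaggerated for this toy model

p_adv = sigmoid(w @ x_adv + b)
print(f"clean: {p_clean:.2f}, adversarial: {p_adv:.2f}")
# A uniform-magnitude nudge per component flips the predicted class.
```

In a real vision model the same idea applies with an epsilon so small the noise is imperceptible to humans, which is exactly what the panda image demonstrates.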

Please find below Irina's slides with references to the underlying studies and tools, plus more fun-to-read papers.

Intro slide of the talk poking holes in your deep learning vision model by Irina Vidal

If you have the opportunity to attend one of Irina's talks, do it! Her presentations are boiled down to the essentials, vivid and entertaining, and last but not least: she actually has something to say.

Matti Lyra

Matti Lyra holds a PhD in NLP. After leaving academia he joined Comtravo, where he works as a research engineer on an end-to-end automation pipeline for travel bookings.

The idea of language models has created a lot of buzz in the last 12 months, culminating in OpenAI's decision not to publish a certain model because it was too "good": good in the sense that its texts can no longer be distinguished from human-written ones. This raises the fear of misuse for "deep fake news".
The NLP community has realized that language models can serve as a basis for training models for all kinds of other NLP tasks. Matti gave an overview of the last year of research and an intuition for what these language models are and how they are deployed to production:
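For intuition: a language model assigns probabilities to the next word given the preceding context. The sketch below is deliberately primitive background, not Matti's material — a counting-based bigram model over a made-up corpus. Modern models (the ELMo/BERT/GPT generation) do the same job with neural networks at massive scale, and it is their learned representations that transfer to other tasks:

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, pre-tokenized by whitespace.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each other word.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_word_probs(word):
    """Conditional distribution over the next word, estimated from counts."""
    counts = bigrams[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probs("the"))   # split evenly between cat / mat / dog / rug
print(next_word_probs("sat"))   # "on" with probability 1.0
```

Sampling from such conditional distributions word by word already generates text; scale the context window and the model capacity up by many orders of magnitude and you arrive at the models causing the buzz.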

Christopher Werner

The third talk, by Christopher Werner, was about the Prediction of Status Changes in Software Tests. Nowadays test-driven development has become the standard. As Christopher pointed out, there are software systems that take a whole week to run their full test suite. In these cases it is very beneficial to train a model that can predict the sanity state of the system before running all tests. Christopher recently finished his master's thesis on the topic. See his presentation for more details and results:
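This summary does not cover Christopher's actual features or model, so the following is only a generic sketch of the idea, with invented feature names and scoring: estimate each test's risk of failing from simple signals (recent failure history, overlap with the files touched by the current change) and run the riskiest tests first, so a broken build is detected long before the week-long suite finishes.

```python
def risk_score(test, changed_files):
    """Hypothetical risk heuristic: recent failures plus weighted file overlap."""
    overlap = len(test["covers"] & changed_files)
    return test["recent_failures"] + 2 * overlap

# Invented test metadata: which source files each test covers,
# and how often it failed in recent runs.
tests = [
    {"name": "test_login",   "covers": {"auth.py"},             "recent_failures": 0},
    {"name": "test_booking", "covers": {"booking.py", "db.py"}, "recent_failures": 3},
    {"name": "test_search",  "covers": {"search.py"},           "recent_failures": 1},
]

changed_files = {"booking.py"}
ordered = sorted(tests, key=lambda t: risk_score(t, changed_files), reverse=True)
print([t["name"] for t in ordered])
# test_booking runs first: it failed recently and covers a changed file.
```

A real system would replace the hand-written heuristic with a trained classifier, which is where the modeling work of the thesis comes in.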

Finally, a thank-you to the speakers for the inspiring presentations, to the PyData Hamburg crew for organizing yet another great event, and to G+J for hosting and support!

Author: Philipp Pahl