Interview with Josefine - Data Scientist and moinworld mentor for SQL

October 14, 2020

Can you briefly introduce yourself and tell us something about your professional career?

I currently work as a data scientist at the Otto Group, where I develop machine learning solutions for the group companies across a wide range of use cases. I started with a dual study program in applied computer science in Stuttgart in cooperation with IBM. During the various practical assignments, I noticed that I particularly enjoy working with data. That’s why I decided to do a Master’s degree specifically in Data Science and went to a university in Lisbon for it. I then joined the Otto Group through an internship and my master’s thesis, and I have now been developing data science solutions in e-commerce for over 3 years.

How did you first come into contact with SQL?

Actually, I first learned it at grammar school. I was lucky to have computer science lessons from the ninth grade onwards and learned a lot there. We covered SQL in particular, and how to set up databases, in upper school. It then came up again in my studies and at work anyway.

Did it help you to have continuous computer science lessons from the ninth grade on?

Definitely! I would almost certainly not have chosen computer science as a course of study if I hadn’t had it as a school subject before, and I am very grateful that I was able to take computer science as a subject for my school-leaving exams. I think it’s a pity that in many federal states and at many schools computer science is not treated as an important subject, and where there are computer science lessons, they usually have little to do with programming and more to do with the general use of computers.

How do you use SQL in your work?

SQL is a language for viewing, editing and, of course, creating data. It’s like a hammer in a toolbox - you definitely need it. SQL is not just a tool for data scientists and people who program a lot; it can help many other people too, for example if you work in marketing and deal with data there. More and more data is being collected, and the more data there is, the less it is enough to work with Excel alone, because it takes too long and Excel is simply not a good tool for handling large amounts of data. At that point it makes sense to use databases and tables, with SQL as the language for working with the data. It is also a language that is very easy to learn and quick to work with.
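As a rough illustration of viewing, editing and creating data with SQL, here is a minimal sketch using Python's built-in sqlite3 module. The "orders" table, its columns and the sample values are made up for this example and do not come from the interview.

```python
import sqlite3

# In-memory database; the "orders" table and its contents are hypothetical.
conn = sqlite3.connect(":memory:")

# Create: define a table and fill it with a few rows.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("Anna", 19.99), ("Ben", 5.50), ("Anna", 42.00)],
)

# Edit: correct one value in place.
conn.execute("UPDATE orders SET amount = 6.50 WHERE customer = 'Ben'")

# View: read the rows back, largest order first.
rows = conn.execute(
    "SELECT customer, amount FROM orders ORDER BY amount DESC"
).fetchall()
conn.close()

print(rows)
```

The same three kinds of statement work the same way on much larger tables, which is where a database starts to pay off compared with a spreadsheet.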

In which areas is SQL used?

Wherever a lot of data is collected, there are usually databases, and then there is often no getting around SQL. There are also other types of databases, but SQL is the standard language for relational databases. If you work in IT, you will come into contact with SQL from time to time. But there are also many other areas with data that can be processed with SQL, for example, as I said, analyses in marketing, but also finance, controlling, sales or recruiting.

Why should you learn SQL?

One reason is that SQL can be learned extremely quickly, so you can gain new insights from existing data in a very short time. It can be combined with other programming languages and is used, among other things, to prepare data for further processing, for example in Python. You should definitely learn SQL if you work a lot with Excel and Excel is getting too complicated or no longer does what you want. Even if you already use dashboards, many of them let you embed SQL queries or at least use SQL-like languages.
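To make the idea of preparing data in SQL and processing it further in Python concrete, here is a small sketch. It assumes a made-up "visits" table with click counts per marketing channel; the table, the numbers and the variable names are illustrative only.

```python
import sqlite3
from statistics import mean

# Hypothetical sample data: clicks per marketing channel and day.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (channel TEXT, day TEXT, clicks INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        ("newsletter", "2020-10-01", 120),
        ("newsletter", "2020-10-02", 80),
        ("social", "2020-10-01", 200),
        ("social", "2020-10-02", 100),
        ("search", "2020-10-01", 50),
    ],
)

# SQL does the preparation: one row per channel with its total clicks.
totals = dict(
    conn.execute(
        "SELECT channel, SUM(clicks) FROM visits GROUP BY channel"
    ).fetchall()
)
conn.close()

# Python takes over: find the channels that perform above average.
average = mean(list(totals.values()))
above_average = sorted(c for c, total in totals.items() if total > average)

print(totals)
print(above_average)
```

Letting the database do the grouping means Python only receives the small, already aggregated result, not every raw row.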

What tools, apart from SQL, do you use as a data scientist?

Python is definitely one of the tools we use all the time. There are many Python packages that were developed specifically for data science and that many data scientists use, for example scikit-learn or TensorFlow. We also work in Google Cloud, a cloud platform that provides databases and other important IT infrastructure.

What do you particularly like about your work as a data scientist?

When you work as a data scientist, there is usually a problem that you have to solve. How to solve it is completely open at first. Nobody tells me exactly what the solution has to look like, so I can be very creative and try out what might work. The challenge is to first think about what data I need, how I have to prepare it and how I can gain even more insights from it. Then I use my toolbox of machine learning models to develop a product that adds value and solves the problem. You seldom create anything colourful, such as a website that you can practically touch; the whole thing is, of course, based on numbers and therefore looks very dry at first. But what I find really exciting is that you have to put a lot of your own thinking into solving the problem and not just translate a series of requirements into code.

And you?

If you would like to join our next SQL class, you can find the upcoming dates here.