A day in the life of a Data Scientist
The internet would have you believe that Data Science is simultaneously the sexiest job of the 21st century and the role most likely to become obsolete. Confused? Yeah, so am I.
This doesn’t bother me one bit. As Data Scientists we like probabilities and we aren’t certain about anything. This means we are not very good at explaining what we actually do. What I can do is tell you what it’s like to be a Data Scientist at KrakenFlex.
The Best Part
The best part of being a Data Scientist at KrakenFlex is the problems that we get to work on. They’re very complex problems that no one has solved before and they contribute directly towards preventing climate change.
I am part of a team that is currently developing the software and algorithms that control electric car charging overnight in order to use the greenest and cheapest electricity (https://octopus.energy/intelligent-octopus/). This involves prediction (how many cars will need charging and how much energy do they need?) and optimisation (when should we charge them?). Both are super challenging data science problems and new problems for the energy industry. The electrification of transport is a trend that brings a totally different kind of electricity demand from anything we’ve experienced before. What’s most exciting is that this demand can be flexible, and at KrakenFlex we live for flexibility!
What is flexibility? Think about someone plugging in their electric car at 6pm when they get home from work. At 6pm the electricity grid is under pressure, with lots of people using electricity as everyone gets home from work and cooks dinner. That person's car most likely doesn’t need to be charged straight away. Instead we can wait until the middle of the night, when everyone is asleep and demand is much lower. That’s what I mean by flexibility: the ability to turn things on and off at different points in the day. This is going to become more important because of how many electric cars we will soon have on our roads, plus the electrification of heat as we move away from traditional gas-powered boilers.
By moving demand away from the peak (between 4pm and 7pm) we can reduce how often fossil fuel power plants need to be powered up to meet demand. A few hundred thousand cars charging at home at the same time will require power equivalent to the output of a large power station.
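To make the "when should we charge them?" question a bit more concrete, here is a minimal sketch of the simplest version of the idea: given a cost for each half-hour slot while a car is plugged in, pick the cheapest slots that deliver the energy it needs. This is a toy illustration and not how our production optimisation works. The function name, the 7 kW charger, the half-hour slots and the cost numbers are all assumptions I've made up for the example.

```python
import math


def pick_charging_slots(cost_per_slot, energy_needed_kwh, charger_kw=7.0, slot_hours=0.5):
    """Return the indices of the cheapest slots that together deliver the energy needed.

    cost_per_slot holds one number per slot while the car is plugged in
    (a price or a carbon intensity; lower is better). All values are illustrative.
    """
    energy_per_slot_kwh = charger_kw * slot_hours
    slots_needed = math.ceil(energy_needed_kwh / energy_per_slot_kwh)
    if slots_needed > len(cost_per_slot):
        raise ValueError("The car is not plugged in long enough to get the energy it needs")
    # Rank slots from cheapest to most expensive, keep as many as we need,
    # then put the chosen slots back into time order.
    ranked = sorted(range(len(cost_per_slot)), key=lambda i: cost_per_slot[i])
    return sorted(ranked[:slots_needed])


# Made-up costs for eight half-hour slots, starting at the evening peak.
costs = [30, 32, 28, 12, 10, 9, 11, 25]
schedule = pick_charging_slots(costs, energy_needed_kwh=10)
print(schedule)  # [4, 5, 6]
```

With these numbers the car skips the expensive early slots and charges in slots 4 to 6. The real problem is much harder: we don't know for certain when cars will be plugged in or how much energy they need, and many cars have to be scheduled together.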
Our other Data Scientists at KrakenFlex are also working on cool, complex problems, including optimising grid-scale batteries, researching how a local energy market might work, and building new functionality into our optimisation tooling.
The Bizarre Part
Team names. Data Scientists are spread throughout KrakenFlex in cross-functional teams. I am part of the “Funky Gibbons,” which is home to 3 Software Engineers, 2 Data Scientists, and our product maestro. We also have a Data Science guild where all the Data Scientists come together to share knowledge or help each other out. Working in this way provides so many great opportunities to learn. Data Scientists at KrakenFlex are encouraged to code like a dev and are helped to get there through the abundance of pair programming opportunities (where you team up and code with someone else). Not to be outdone on the bizarre part, the Data Science team has its own logo, which I have yet to understand, but which involves a unicorn. We can’t be accused of taking ourselves too seriously!
Whilst we work in the domain of energy flexibility, we also work flexibly between home and the office in a combination that suits.
The Technical Part
If I were to try and break down a typical workflow for a Data Scientist at KrakenFlex, it would look something like this:
Step 1 - Discovery
This is where we bottom out the details of the problem we’re trying to solve. We use Miro, which is like an online whiteboard, and we ask lots of questions to tease out what we're actually trying to do. We tend to do this as a team which makes it interactive as well as conceptually interesting. It is also important to understand why we’re solving the problem (in case it doesn’t need solving at all).
Step 2 - Proof of Concept
The next stage of the process for the Data Scientists will likely involve using a Jupyter Notebook to test out the ideas suggested in discovery and start building out a proof of concept. This is where we can really start to dig into the details of how an approach or algorithm works and whether the results are likely to meet our requirements. This stage is often very iterative and we will regularly give results to the product team or customer for feedback. We can also make use of components of our optimisation framework, which we have been building out since the company was founded. There aren’t rules here; we’re just looking for the best algorithm or approach to solve the problem.
Step 3 - Build
The third stage is likely to involve building out the best version from the proof of concept work into the main code base. This is where we will start working closely with the software engineers again to build out the interactions, the data feeds and tests that will reliably catch any bugs in our implementation. This work will no longer be within notebooks but will still be based in Python.
Step 4 - Monitoring
Once things are in production we can’t just down tools and relax. We need to think about how well our solution is performing (how much carbon and money we saved) and monitor this over time. Alongside our software developers we also look at how best to store and query our data so that other colleagues can make use of it. For example, with our vehicle charging optimisation, a customer may reach out about an issue with their car's charge overnight. It’s important that the customer support team can easily access the data they need to explain to the customer what went wrong and provide a good customer experience.
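As a rough illustration of what "how much carbon we saved" can mean, one simple approach is to compare the schedule we actually ran against a baseline where the car charges as soon as it is plugged in. The sketch below does that for a single night. It is a simplified example rather than our real monitoring code: the function names, the 3.5 kWh delivered per slot and the carbon intensity figures are all hypothetical.

```python
def charging_emissions_g(slot_indices, intensity_g_per_kwh, energy_per_slot_kwh=3.5):
    """Grams of CO2 emitted by charging in the given slots (illustrative model)."""
    return sum(intensity_g_per_kwh[i] * energy_per_slot_kwh for i in slot_indices)


def carbon_saved_g(baseline_slots, optimised_slots, intensity_g_per_kwh, energy_per_slot_kwh=3.5):
    """Emissions of the charge-immediately baseline minus emissions of the optimised schedule."""
    baseline = charging_emissions_g(baseline_slots, intensity_g_per_kwh, energy_per_slot_kwh)
    optimised = charging_emissions_g(optimised_slots, intensity_g_per_kwh, energy_per_slot_kwh)
    return baseline - optimised


# Made-up grid carbon intensity (gCO2/kWh) for eight half-hour slots.
intensity = [250, 240, 230, 120, 100, 90, 110, 200]
saved = carbon_saved_g(baseline_slots=[0, 1, 2], optimised_slots=[4, 5, 6],
                       intensity_g_per_kwh=intensity)
print(saved)  # 1470.0
```

The same comparison works for money if you swap carbon intensity for price. Tracking a number like this night after night is what lets us see whether the solution is still doing its job.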
The Fun Part
Funky Gibbons have a soft spot for online board games (nothing like trying to play Pictionary with a touchpad). We also regularly complete retros where, as a team, we celebrate what went well in the last sprint and propose ways we can improve. Every project is technically different and we’re constantly making tweaks to the way we work too. Or, as was the case last week, we go on a social to see who is secretly a pro at darts (naming no names).
Do I have any advice?
If you’re looking to get into data science my advice is always really simple. Find a problem you find interesting and start trying to solve it with data. Start with Python and start with a Jupyter Notebook. I think working on your own problem is much more motivating and more representative of the challenges you’ll face in reality. For example, I love energy and sport, so I would be thinking about things like:
Using your own smart meter data to forecast your electricity use.
Using exercise data from a smartwatch to analyse your heart rate and training performance.
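To give a flavour of the first idea, a good starting point is a very simple baseline: assume tomorrow looks like today, then measure how wrong that is. The sketch below is only an assumption about how you might begin. The function names are mine, the readings are invented, and I've used four readings per day to keep the example short (half-hourly smart meter data would have 48).

```python
def seasonal_naive_forecast(readings_kwh, slots_per_day=48):
    """Forecast the next day by repeating the most recent day of readings."""
    if len(readings_kwh) < slots_per_day:
        raise ValueError("Need at least one full day of readings")
    return list(readings_kwh[-slots_per_day:])


def mean_absolute_error(actual, forecast):
    """Average absolute difference between what happened and what we predicted."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must be the same length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Two invented days of usage (kWh), four readings per day for brevity.
history = [0.2, 0.5, 0.9, 0.3, 0.3, 0.4, 1.0, 0.2]
forecast = seasonal_naive_forecast(history, slots_per_day=4)
print(forecast)  # [0.3, 0.4, 1.0, 0.2]

# Compare against what "actually" happened the next day (also invented).
actual_next_day = [0.3, 0.5, 0.8, 0.2]
print(mean_absolute_error(actual_next_day, forecast))
```

Once you have a baseline and an error measure, every new idea (weather, day of the week, a fancier model) can be judged by whether it beats that number.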
There are so many different things you can do. Alternatively you can find problems that are more structured and already set out in formal courses. These are great, but remember that part of the job of a Data Scientist is understanding the problem that needs solving. I encourage you not to be put off by what might seem like a mountain of knowledge you need to acquire. You’ll be amazed at how quickly you can start to contribute.
You might have noticed that I haven’t mentioned machine learning or statistics or AI, and sure, these are all really relevant terms to the world of a Data Scientist at KrakenFlex, but there’s so much more to our day than these buzzwords might give us credit for. Whether you believe the internet’s view on Data Science or not, I hope sharing a little bit about a day in the life of a Data Scientist at KrakenFlex has been helpful. There’s a tidal wave of data science problems that need solving to get the world to net zero and if this sounds exciting to you, you won't be disappointed.