Building a fantasy football data newsletter

Fantasy football is massive: 7,429,528 players across the world are tinkering online, each trying to be the best fantasy football manager.

I’ve been playing for a few years now, and it appeals to my nerdy side: poring over scores, rankings and stats to get an edge over the competition.

This year, a group of 10 of us at work joined together to enter the draft league format. In a nutshell, only one of every player exists: once you’ve got Salah, he’s yours and no-one else’s, unless you trade.

Despite draining hours each week catching up on the highlights, reading up on the latest stats and scores, and debating with my colleagues who to put on my bench, I’ve found myself yearning for more… it needs more charts…

charts

The Project

Being a data nerd, I had to get hold of more data, more insight, more ways of debating who is the better fantasy manager with my work colleagues.

So I decided to take matters into my own hands using Python: to learn how to pull the Fantasy Premier League (FPL) data, get it into a usable format, build some cool charts, and share them with people.

1) Pulling FPL Data

The first thing I needed to tackle was pulling data from the Fantasy Premier League website.

Luckily, in programming as with anything, someone has usually ventured down any given path before. A bit of Googling turned up a really good overview to get me started (credit to fantasyfutopia). The blog explained how to use the requests Python library to call the FPL API, capture the JSON response and convert it into a pandas DataFrame.
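Here is a minimal sketch of that request-to-DataFrame flow. It is not the code from the fantasyfutopia guide: the URL is an assumption (the post doesn’t give one), and fetch_json and section_to_frame are names invented for illustration.

```python
import pandas as pd
import requests

# Assumption: the post does not state the endpoint; check the URL in your
# browser's Network tab before relying on it.
BOOTSTRAP_URL = "https://fantasy.premierleague.com/api/bootstrap-static/"


def fetch_json(url, get=requests.get, timeout=10):
    """Call an API endpoint and return the decoded JSON response."""
    response = get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.json()


def section_to_frame(payload, section):
    """Turn one list-of-records section of the payload into a DataFrame."""
    return pd.DataFrame(payload[section])
```

With those two helpers, section_to_frame(fetch_json(BOOTSTRAP_URL), "elements") would give one row per player. The get parameter is only there so the function can be tried out without hitting the network.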

Discovering API calls

That was a good start, but I didn’t really understand how to discover which APIs I needed to call to get what I wanted. The guide on fantasyfutopia already knew which API it wanted to call, whereas I wanted to explore! A bit more Googling and I found this post on reddit which really opened my eyes! I’ve written out the steps from this post below. Try it out yourself!

  1. Go to the page from which you want to retrieve the details

  2. Press F12 (this will open the developer tools window on the side)

  3. Go to Network Tab

  4. Refresh the page

  5. A bunch of stuff will load in the table under the Network tab

  6. Look for all items with "fetch" in the Type column

  7. The corresponding entries in the Name column are the API links

Putting it into practice

fantasy premier league api

Let’s walk through an example. I’ve gone to the draft FPL site, hit F12 and refreshed, and you can see a long list of “fetch” items.

bootstrap_static_results.png

If you click on one in particular, for example “bootstrap-static” like above, you can see a preview of the JSON response data you would receive.

You can see that one of its sections is “elements”, which holds all 624 elements (the football players) in the game, with details about them like name, total_points, goals, assists and so on.
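Once a response like this is loaded in Python, the same exploration can be done in code. The sketch below uses a tiny hand-made payload as a stand-in for the real one; the player values and the "settings" key are invented, and summarise_payload and pick_fields are illustrative helper names.

```python
def summarise_payload(payload):
    """Map each top-level key to the size of its value (None if it has no length)."""
    return {
        key: len(value) if isinstance(value, (list, dict)) else None
        for key, value in payload.items()
    }


def pick_fields(records, fields):
    """Keep only the named fields from each record, skipping any that are absent."""
    return [{f: r[f] for f in fields if f in r} for r in records]


# Made-up stand-in for the bootstrap-static response.
sample = {
    "elements": [
        {"id": 1, "name": "Player A", "total_points": 42, "goals": 3, "assists": 1},
        {"id": 2, "name": "Player B", "total_points": 17, "goals": 0, "assists": 4},
    ],
    "settings": {"league_size": 10},
}

print(summarise_payload(sample))
print(pick_fields(sample["elements"], ["name", "total_points"]))
```

Summarising the top-level keys first is a quick way to spot which sections are worth turning into a table.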

What I wanted to pull

With that in mind, I created a function to call the 4 different APIs I found useful for digging into analysis. These were:

  1. transactions - Transfers of players made by managers in the league

  2. elements - Details about all of the 600+ players in the game

  3. details - Data on standings, managers, gameweek matches

  4. element_status - Gives ownership status of players in the game, so you can determine which manager owns which players

get_json
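The original function is only shown as an image, so here is a rough sketch of what a download-and-save helper along those lines could look like. Everything specific in it is an assumption: the league ID is hypothetical, the URL paths are unverified guesses that should be replaced with the ones from your own Network tab, and the signature is invented for illustration.

```python
import json
from pathlib import Path

import requests

# Assumptions, not taken from the post: the league ID and these URL paths.
# Copy the real ones from the Network tab of your own league page.
LEAGUE_ID = 12345  # hypothetical
BASE = "https://draft.premierleague.com/api"
ENDPOINTS = {
    "transactions": f"{BASE}/draft/league/{LEAGUE_ID}/transactions",
    "elements": f"{BASE}/bootstrap-static",
    "details": f"{BASE}/league/{LEAGUE_ID}/details",
    "element_status": f"{BASE}/league/{LEAGUE_ID}/element-status",
}


def _download(url, timeout=10):
    """Fetch one endpoint and return its decoded JSON body."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.json()


def get_json(endpoints, out_dir="json", fetch=_download):
    """Save each endpoint's response as <name>.json and return the paths written."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = {}
    for name, url in endpoints.items():
        target = out_path / f"{name}.json"
        with target.open("w", encoding="utf-8") as handle:
            json.dump(fetch(url), handle)
        written[name] = target
    return written
```

Calling get_json(ENDPOINTS) would then leave four JSON files in a local json folder. Saving to disk first means the charts can be rebuilt without calling the API again.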

2) Getting the data into a usable format

Once I had the various datasets saved locally as JSON files, I wanted to get them into a usable format. In Python this is generally going to be a pandas DataFrame.

So I created a function get_data() where I could name the dataset I wanted, and it would pull the data from the relevant JSON file and return it as a pandas DataFrame.

get_data.png

So if I want all the matches data, I just call:

df = get_data('matches')
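Since the real get_data() only appears as an image, the following is a hedged sketch of how such a loader might be put together. The DATASETS lookup, the file names, the keys inside each file and the json_dir parameter are all assumptions made for illustration.

```python
import json
from pathlib import Path

import pandas as pd

# Hypothetical lookup: dataset name -> (JSON file, key inside that file).
DATASETS = {
    "matches": ("details.json", "matches"),
    "standings": ("details.json", "standings"),
    "elements": ("elements.json", "elements"),
    "transactions": ("transactions.json", "transactions"),
}


def get_data(name, json_dir="json", datasets=DATASETS):
    """Load one named dataset from its JSON file and return it as a DataFrame."""
    try:
        filename, key = datasets[name]
    except KeyError:
        raise ValueError(f"Unknown dataset: {name!r}") from None
    with (Path(json_dir) / filename).open(encoding="utf-8") as handle:
        payload = json.load(handle)
    return pd.DataFrame(payload[key])
```

Keeping the name-to-file mapping in one dictionary means a new dataset is a one-line addition rather than a new function.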
 

3) Exploring the data

Once the data was at my fingertips, it was time for the fun stuff: finally getting to explore the data and build some charts that show me unique insights!

The exploratory trap

To begin with, I spent a lot of time jumping from dataset to dataset, not really building anything of true value and with no finished product. Don’t get me wrong, it was fun, but I started to realise I was going round in circles aimlessly.

I realised I needed to formulate a question. Something that I am genuinely interested in. Then I’d be motivated to find an answer to it with the data.

So I thought about what I wanted to produce: a game-week newsletter with cool charts that give some unique insights into the data.

I then brainstormed with a friend some topics to dig into and questions to answer. Here’s a taste of what we came up with:

  • Standings over time

  • Transfers

    • Activity/Volume

    • Points gain/loss from transfers

  • High scores

    • Individual football players

    • Position based (strongest defence, midfield, forward)

  • Substitutes

    • What if I hadn’t benched “x”

    • Most bench points

  • Records

    • Longest win streak

    • Losing streak

    • Gameweek highscores

  • Margins of victory/loss

The build-it-all trap

After coming up with this list, I fell into another trap: thinking I needed to build all of it before sharing anything. After a week or two I realised how silly that was; doing everything on the list could take me a year! Not only would the football season be over by that point (and halfway into the next, for that matter), but I could spend all that time building a perfect newsletter, only to release it to indifferent users / colleagues. Now of course, hobby projects aren’t built for the satisfaction of others, but my point is more general: better to release early and often, and get vital feedback from others to learn how to improve.

It’s a mindset I’ve started to pick up from my new role in a Data Science Agile development team, which obviously is more focused on building products for other teams and customers, but the same thing applies for personal projects. If you want to get better faster, get it out there in front of others!

4) Creating the charts

I eventually decided to narrow down my first newsletter to have 3 charts.

  • Standings

  • Top scoring owned players

  • Streaks

Standings

A view of points (and in turn league standings) over time in game-weeks. I find it interesting to see groupings emerge - a bit like the big 6 in the Premier League, or like the peloton in cycling.

standings.png
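The data behind a chart like this is just a running total per manager. Here is a small sketch using only the standard library; the input shape (a list of per-gameweek points for each manager) and the function names are assumptions, and the numbers in the usage note are made up.

```python
from itertools import accumulate


def cumulative_points(weekly_points):
    """Turn per-gameweek points into running totals for each manager."""
    return {
        manager: list(accumulate(points))
        for manager, points in weekly_points.items()
    }


def standings_at(totals, gameweek):
    """Rank managers by running total after the given 1-based gameweek."""
    return sorted(totals, key=lambda manager: totals[manager][gameweek - 1], reverse=True)
```

For example, cumulative_points({"Dave": [3, 3, 1], "John": [0, 3, 0]}) gives one line per manager to plot against the gameweek number, and standings_at() on that result gives the table order at any point in the season.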

Top Scoring Owned Players

The top 10 scoring players for the latest game-week, coloured by manager! (I had 2 of the top 10 ;P)

players.png
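A chart like this needs the player scores joined to the ownership data. The sketch below shows one way to do that join with pandas. All column names (id, name, event_points, element, owner) are assumptions about how the two tables might look after loading, not the actual FPL field names.

```python
import pandas as pd


def top_owned_players(players, ownership, n=10):
    """Return the n highest-scoring players that have an owner, with that owner."""
    owned = ownership.dropna(subset=["owner"])  # drop players nobody owns
    merged = players.merge(owned, left_on="id", right_on="element", how="inner")
    top = merged.nlargest(n, "event_points")
    return top[["name", "owner", "event_points"]].reset_index(drop=True)
```

The inner join drops unowned players, so the result can be passed straight to a bar chart coloured by the owner column.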

Streaks

After the last gameweek, how are managers’ win streaks looking? Dave is on a strong 6-week streak; John, on the other hand, is looking shaky! Ben and Cory obviously drew this week.

streaks.png
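Working out a streak comes down to counting identical results backwards from the latest gameweek. Here is a short sketch, assuming each manager’s results are stored as a list of "W", "D" and "L" values in gameweek order (that encoding and the function names are assumptions).

```python
from itertools import groupby


def current_streak(results):
    """Return (result, length) for the run of identical results ending at the latest gameweek."""
    if not results:
        return None, 0
    latest = results[-1]
    length = 0
    for result in reversed(results):
        if result != latest:
            break
        length += 1
    return latest, length


def longest_streak(results, kind="W"):
    """Length of the longest unbroken run of `kind` anywhere in the season."""
    return max(
        (len(list(run)) for value, run in groupby(results) if value == kind),
        default=0,
    )
```

current_streak() covers the "how are things looking right now" chart, while longest_streak() would answer the season-records question from the brainstorm list.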

5) Pulling the Email together

Once I had the charts built, it was a matter of getting email working!

I built a separate gw_email.py script which pulls together the charts I have produced and emails them to me.

Eventually I intend to use it as a push-button release: it will re-pull all of the latest data, build the charts and then email the newsletter to all my colleagues.
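The post doesn’t show gw_email.py, so the following is only a sketch of how such a script could attach chart images and send them, using Python’s standard library. The function names, the SMTP-over-SSL choice, and all addresses and server details are assumptions.

```python
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_newsletter(sender, recipients, subject, body, chart_paths):
    """Create an email with the body text and each chart attached as a PNG."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    message["Subject"] = subject
    message.set_content(body)
    for path in map(Path, chart_paths):
        message.add_attachment(
            path.read_bytes(), maintype="image", subtype="png", filename=path.name
        )
    return message


def send_newsletter(message, host, port, username, password):
    """Send the message over an SSL-wrapped SMTP connection."""
    with smtplib.SMTP_SSL(host, port) as server:
        server.login(username, password)
        server.send_message(message)
```

Splitting the building from the sending makes it easy to inspect the message, or email only yourself, before going push-button with the whole league.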

6) Future Plans

As mentioned earlier, I want to release early and release often.

It was great to send my first newsletter out last week and stir up a bit of healthy competition by showing how big Dave’s current win streak is, the shaky ground John is on with 4 losses in a row, and how bad I am at transfers, since I recently traded Pepe to Thomas, only for Pepe to be this game-week’s top scorer!

This experience has made me more motivated to continue adding to this project and to keep building more unique insights and charts.

I am also keen to learn how to better structure my GitHub repo in a way that facilitates open source contribution, whether that’s from my colleagues or from other interested users out there. Stay tuned on this one because I have a lot to learn, and that means I’ll have a lot to share about my experience!
