The sports industry is one of the sectors helping to show how important and practical data can be. You have probably come across a data visualization in a sports broadcast that caught your attention because of how striking it was, such as the real-time calculation and display of tire condition or G-force in Formula 1.

Formula 1 is a particularly good example of innovation in the use and processing of data. In 2018 it partnered with Amazon Web Services to take advantage of the machine learning capabilities of the cloud platform, with data scientists from both organizations working together. This led to innovations such as the ‘Overtaking Probability’ and ‘Advantage after Pit Stop’ graphics that we have seen during races.


Photo: Amazon Web Services YouTube account

From the sensor to your TV in 500 milliseconds

Each F1 car has 300 sensors that generate 1.1 million telemetry data points per second. To digest this volume of information and turn it into real-time insight for both teams and fans, Formula 1 set up an architecture built on AWS services.


In this architecture, the data generated during the race first passes through the F1 infrastructure, which makes an HTTP call to send the information to the Amazon cloud, using Amazon API Gateway as the entry point. The API is backed by an AWS Lambda function that implements the race logic.
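To make this first hop concrete, here is a minimal sketch of how a client could package one race message as an HTTP POST. The endpoint URL and the message fields (`car`, `type`, `data`) are assumptions made for illustration; the article does not describe the real payload format or address.

```python
import json
import urllib.request

# Hypothetical endpoint; the real API Gateway URL is not given in the article.
ENDPOINT = "https://example.invalid/race/messages"


def build_request(message):
    """Build (but do not send) an HTTP POST carrying one race message as JSON."""
    body = json.dumps(message).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical message shape, for illustration only.
request = build_request({"car": "44", "type": "telemetry", "data": {"lap": 3}})
# Sending it would look like: urllib.request.urlopen(request, timeout=0.5)
```

The request is only built here, not sent, so the sketch runs without network access.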

Once the incoming message is received, the function updates the race status stored in Amazon DynamoDB (a NoSQL database service) and, if the message is a trigger for a prediction, uses the model trained on Amazon SageMaker to make this prediction and return it as the response to the initial call.
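The flow just described (update the state, then predict only on a trigger) can be sketched as a Lambda-style handler. This is an illustration under stated assumptions, not F1's actual code: a plain dictionary stands in for the DynamoDB table, `predict` is a placeholder for the trained model, and the message fields and the `pit_stop` trigger are hypothetical.

```python
import json

# In-memory stand-in for the race-state table (the article uses Amazon DynamoDB).
RACE_STATE = {}

# Hypothetical message types that should trigger a prediction.
PREDICTION_TRIGGERS = {"pit_stop"}


def predict(state):
    """Placeholder for the trained model; returns a made-up score, not a real prediction."""
    return 0.5


def handler(event, context=None):
    """Update race state and return a prediction only when the message is a trigger."""
    message = json.loads(event["body"])
    car = message["car"]
    # Merge the incoming data into the stored state for this car.
    RACE_STATE.setdefault(car, {}).update(message["data"])

    response = {"car": car, "updated": True}
    if message["type"] in PREDICTION_TRIGGERS:
        response["prediction"] = predict(RACE_STATE[car])
    return {"statusCode": 200, "body": json.dumps(response)}
```

Most messages only update the state, so the response carries a prediction only when the message type is one of the triggers.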

Finally, this response passes back through the F1 infrastructure to the broadcast center, where it can be used by the technical team and the race director. The complete cycle, from sensor activation to response at the destination, takes less than 500 milliseconds.

One reason these times are achievable is that models trained on historical race data are preloaded into an application hosted on AWS Lambda, so the model is already in memory alongside the running code. The models themselves are trained in Amazon SageMaker using the open-source XGBoost algorithm, which helps keep these times to a minimum.
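The preloading idea can be sketched in a few lines: load the model once at module import, which in a Lambda-style runtime happens on a cold start, and reuse it on every later invocation. The `load_model` function and the `gap_seconds` feature are hypothetical stand-ins; a real deployment would deserialize a trained model artifact at that point.

```python
LOAD_COUNT = 0  # counts how many times the (simulated) model load runs


def load_model():
    """Stand-in for deserializing a trained model from a model artifact."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    # Hypothetical "model": a function that scores a dictionary of features.
    return lambda features: 0.1 * features.get("gap_seconds", 0.0)


# Runs once, when the module is imported, not on every invocation.
MODEL = load_model()


def handler(event, context=None):
    """Score the event with the model that is already in memory."""
    return {"score": MODEL(event)}
```

However many times `handler` is called, `LOAD_COUNT` stays at 1, so the loading cost is paid once rather than on each request.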

They are also currently working on the redesign of the cars for 2022, in a project that uses Amazon EC2 and the new Graviton2 processors, which power the C6g instances and improve workload performance.

Sport, technology and the future

The case we have just seen is only one representative example of the innovations in the use and processing of data that are taking place in the world of sport. Formula 1 has always been one of the most technologically advanced sports, but others are following. Soccer, which is more traditional and more reluctant to innovate, has nevertheless advanced by leaps and bounds in recent years. Image recognition for player positioning and the use of sensorized clothing are already part of the routine of any high-level soccer team, and the needs in this area keep growing (Manchester City even has astrophysicists working as data analysts!). This is evidence that technology and sport are closer than ever, and everything seems to indicate that we will see important innovations in the coming years.

Image: Unsplash | @abedismail


  • Data Analyst at Keepler: “Problem solver, especially ones related to data and technology. Always focused on improving, being updated and sharing what I've learned. If you love to analyze sports based on data, we will be great friends :)”