To identify the talents of tomorrow, Team Arkéa Samsic came up with the idea of creating an innovative program using data collected by numerous amateur cyclists during their training sessions. To exploit this data effectively, Arkéa Samsic relied on AWS managed services.
With a connected watch on the wrist or a smartphone in the pocket, many amateur and professional athletes routinely record data from their training sessions and then analyze it in an application.
In cycling, dedicated connected devices fitted to the bicycle enable a more in-depth study of effort and physical capacity. These include power meters positioned in the drivetrain (crankset, hub or pedals); the recorded data covers speed, power, cadence, torque, heart rate and altitude.
Until recently, sensor data from “official” riders was collected via connected watches, but today this data feeds directly into applications such as Training Peaks, which are used to monitor performance and draw up training programs based on the rider's goals. The increasingly widespread use of this type of application produces a considerable mass of data from professional riders, but also from a multitude of anonymous riders of all ages and nationalities, among whom lie some real talents.
Data: detecting talent in a mass of data
In modern cycling, the level is so high that every detail counts for a rider who wants to perform. It is therefore essential to identify athletes who already have very high potential.
Team Arkéa Samsic came up with the idea of setting up a program to detect such talent by analyzing the data regularly recorded by the riders. The only requirements for taking part in this program are to have a Training Peaks account and to be equipped with a power meter.
A total of 650 candidates, of 38 different nationalities, registered online and uploaded all the data measured during their training sessions or competitions between April and July 2020. All that remained was to exploit this wealth of information to identify the most promising talents, for example by measuring a rider’s best average power output over 20 minutes (the Tired 20) after an effort corresponding to an energy expenditure of 3,000 kcal.
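As an illustration only, a metric of this kind (best 20-minute average power after a given amount of prior effort) could be computed along the following lines. This is a minimal sketch, not the team's actual method. It assumes one power sample per second and uses the common rule of thumb that 1 kJ of mechanical work on the bike corresponds to roughly 1 kcal of energy expended. The function name `best_power_after_work` and its parameters are hypothetical.

```python
from itertools import accumulate
from typing import Optional, Sequence


def best_power_after_work(
    power_watts: Sequence[float],
    work_threshold_kj: float = 3000.0,  # assumption: ~3,000 kcal expended ~ 3,000 kJ of work
    window_s: int = 1200,               # 20 minutes at one sample per second
) -> Optional[float]:
    """Best average power (W) over `window_s` seconds, considering only the
    part of the ride after `work_threshold_kj` of mechanical work is done.

    Returns None if the threshold is never reached or if fewer than
    `window_s` samples remain after it.
    """
    # At 1 Hz, each sample in watts contributes that many joules of work.
    cumulative_j = list(accumulate(power_watts))
    threshold_j = work_threshold_kj * 1000.0

    # Index of the first sample recorded after the threshold is reached.
    start = next(
        (i + 1 for i, work in enumerate(cumulative_j) if work >= threshold_j),
        None,
    )
    if start is None or len(power_watts) - start < window_s:
        return None

    # Sliding-window sum over the post-threshold samples.
    window_sum = sum(power_watts[start:start + window_s])
    best_sum = window_sum
    for end in range(start + window_s, len(power_watts)):
        window_sum += power_watts[end] - power_watts[end - window_s]
        best_sum = max(best_sum, window_sum)

    return best_sum / window_s
```

For a ride made of 15,000 seconds at 200 W (exactly 3,000 kJ of work), followed by 20 minutes at 300 W and 10 minutes at 100 W, the function returns 300.0. A real pipeline would also need to handle gaps, pauses and irregular sampling in the recorded files.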
It was also useful to be able to cross-reference these data with the public site procyclingstats. To achieve this, Team Arkéa Samsic turned to managed solutions from Amazon Web Services, thus ensuring that it had the appropriate IT resources at its disposal, while avoiding heavy investment.
Arkéa Samsic and AWS join forces in the quest for performance and talent
The technology, however complex, needs to fade into the background so that the focus can be on data analysis. That’s the whole point of the cloud and managed services.
Using AWS services and with the support of TeamWork, Team Arkéa Samsic created a dedicated platform for the talent detection program. The simplicity of these tools enabled Arkéa Samsic to design and integrate a data lake on its own, relying on the Amazon S3 storage service and the Amazon DynamoDB database.
Data in the data lake, fed via Amazon Kinesis, was then extracted with an AWS Glue job for analysis. Machine learning models could be deployed quickly and easily with Amazon SageMaker. At the end of the process, sports directors and coaches made their selections, without any special training, from interactive dashboards built with Amazon QuickSight.
This data was processed in compliance with the GDPR. To respect the right to be forgotten, a specific AWS Glue job was developed so that riders who wished to opt out of the program could have their data removed.