To cut the cost of its analytical data warehouse, WeWard turned to TeamWork, which set up an original infrastructure that judiciously combines multiple Amazon cloud services. The project was completed in less than three months.
A French startup created in 2019, WeWard offers a mobile app that aims to combat sedentary lifestyles. The more users walk, the more wards they accumulate, which can be exchanged for various rewards: gifts, vouchers, money, donations…
Since its launch, the app has been regularly enriched with affiliation, which lets users buy products while earning wards, and with more gamification: periodic contests, performance tracking and sharing, etc.
The WeWard app is now deployed in 9 countries and has 20 million users, mainly located in France, but also in other European countries, such as Italy and Spain. Recently, the publisher took its first steps in the United States and Japan.
Separating transactional from analytical workloads
“We rely on a very large database, in which we store all the information we need to run our application and services,” explains Jean Le, DevOps Engineer at WeWard. “All transactions go through this database, which is used by both our mobile app and our back office, and this can cause performance issues. So when we run a big internal query, to conduct a study for example, the front end can be slowed down, which frustrates users.”
On the one hand, the database must support a massive flow of transactions from the users of the application; on the other hand, it must handle particularly large and complex internal queries from the data engineers. These two modes of data access (transactional vs. analytical) are difficult to reconcile. To solve this dilemma, WeWard wanted to set up a data warehouse separate from the main database, in which data engineers can launch their queries without impacting the performance of the mobile application.
“Our PostgreSQL database is hosted on Amazon’s cloud. We reached out to the AWS teams to see how we could build a stronger, more scalable infrastructure. They suggested three of their partners, including TeamWork, which I already knew and which was chosen to support us on this project. TeamWork’s experts were able to take an objective look at our problem and how to solve it, while respecting our needs and our budget.”
TeamWork set up tailored support to meet the startup’s cost requirements: reduced development costs via the TeamWork Mauritius subsidiary, access to AWS funding as a Premier Tier Services Partner, etc.
A serverless Big Data approach
In order to reduce operating costs as much as possible, the WeWard and TeamWork teams opted for an original solution based on the Amazon S3 data storage service together with the AWS Glue Data Catalog. “We had a fairly precise idea of the architecture we wanted to put in place,” says Jean Le. “TeamWork helped us optimize it, and then set up the tools to extract the data, transform it, and load it into a secure storage space on which our data engineers will launch their queries.”
Each day, an Amazon EC2 instance is provisioned. It connects to WeWard’s PostgreSQL database to extract the desired information, which is then transformed via the AWS Glue service and pushed into Amazon S3. Data that is no longer needed by the application is then purged from the PostgreSQL database, before the Amazon EC2 instance is stopped.
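The ordering of the daily job matters: rows should only leave the production database once their export is known to be complete. The sketch below illustrates that ordering only, under loud assumptions: the standard-library sqlite3 module stands in for PostgreSQL, a local directory stands in for Amazon S3, and the step_events table, its columns, and the export_and_purge function are hypothetical names invented for the example. It is not the code TeamWork delivered.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def export_and_purge(conn, out_dir, cutoff_day):
    """Export rows older than cutoff_day to a CSV file, then delete them.

    sqlite3 and a local directory are stand-ins (assumptions) for
    PostgreSQL and Amazon S3. Returns the number of rows archived.
    """
    rows = conn.execute(
        "SELECT id, user_id, steps, day FROM step_events "
        "WHERE day < ? ORDER BY id",
        (cutoff_day,),
    ).fetchall()
    if not rows:
        return 0

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / f"step_events_before_{cutoff_day}.csv"
    with out_file.open("w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["id", "user_id", "steps", "day"])
        writer.writerows(rows)

    # Re-read the file and count data rows before deleting anything.
    with out_file.open(newline="") as fh:
        exported = sum(1 for _ in csv.reader(fh)) - 1  # minus header
    if exported != len(rows):
        raise RuntimeError("export incomplete, nothing was purged")

    # Delete exactly the exported rows in one transaction
    # (committed on success, rolled back on error).
    with conn:
        conn.executemany(
            "DELETE FROM step_events WHERE id = ?",
            [(row[0],) for row in rows],
        )
    return exported


def demo():
    """Run the job once against a small in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE step_events ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, steps INTEGER, day TEXT)"
    )
    conn.executemany(
        "INSERT INTO step_events (user_id, steps, day) VALUES (?, ?, ?)",
        [(1, 4200, "2024-01-01"), (2, 9100, "2024-01-01"), (1, 7300, "2024-01-02")],
    )
    conn.commit()
    with tempfile.TemporaryDirectory() as tmp:
        archived = export_and_purge(conn, tmp, "2024-01-02")
    remaining = conn.execute("SELECT COUNT(*) FROM step_events").fetchone()[0]
    return archived, remaining
```

Calling demo() archives the two rows dated before the cutoff and leaves the most recent one in the table. Deleting by the exact ids that were exported, rather than re-evaluating the date filter, avoids removing rows that arrived between the extraction and the purge.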
When data engineers need to run analyses, they use the Amazon Athena SQL query service, which reads the data directly from Amazon S3. With this architecture, cloud resources are consumed only when Amazon S3 storage is being fed with data or queried.
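As an illustration of how such a query might be submitted, here is a minimal sketch using the boto3 Athena client. The database name, table name, dt partition column, and results bucket are all hypothetical; the article does not describe WeWard's actual schema or partitioning. Because Athena bills according to the amount of data scanned, filtering on a partition column, assuming the data in S3 is partitioned by day, helps keep each query inexpensive.

```python
from datetime import date


def build_athena_request(
    day,
    database="analytics",                            # hypothetical Glue database
    table="step_events",                             # hypothetical table
    output_location="s3://example-athena-results/",  # hypothetical bucket
):
    """Build the keyword arguments for Athena's StartQueryExecution call."""
    if not isinstance(day, date):
        raise TypeError("day must be a datetime.date")
    # Filtering on the (assumed) dt partition limits the data scanned.
    query = (
        "SELECT user_id, SUM(steps) AS total_steps "
        f'FROM "{table}" '
        f"WHERE dt = '{day.isoformat()}' "
        "GROUP BY user_id"
    )
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(day):
    """Submit the query to Athena; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(**build_athena_request(day))
    return response["QueryExecutionId"]
```

The request builder is kept separate from the network call so that the generated SQL can be inspected or tested without an AWS account; run_query returns only the execution id, and the results would still have to be fetched once the query has finished.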
Code-driven infrastructure
This entire data infrastructure is controlled by code, a requirement of WeWard. “At WeWard, we operate in DevOps mode and we love Infrastructure as Code. So we wanted this solution to be entirely managed by code, through Terraform.”
The project was well managed and completed in less than three months. The most complex part was ensuring that the new tool would not impact the performance of the production database during extractions. Another point requiring attention was the need to preserve the system’s compliance with the GDPR.
“I am very satisfied with the result: the architecture put in place is simple and efficient,” sums up Jean Le. “We had a very good relationship with our TeamWork contact and his team. TeamWork was able to answer each of our questions, understand our issues and give us good advice on the possible technological choices. It should also be noted that the Terraform code was written according to current best practices. It is not a black box that TeamWork delivered to us.”
WeWard is already working on the next evolution of its Big Data infrastructure. Discussions are underway on the possibility of aggregating all the data in a single repository.