SUCCESS CASE #BigData #DataPlatform
Relational Data Platform Design and Construction
STRUCTURALIA is an international graduate school specialized in engineering, infrastructure, energy, building, digital transformation and new technologies.
It has trained more than 115,000 students in 52 countries. It has offices in Spain, Colombia, Mexico, Peru, Chile, Puerto Rico and Central America.
Adapting the business to current and future needs
According to Harvard Business Review, the oil of the 21st century is the data that every company generates. But not all companies know how to extract value from it, or are prepared to do so, which has a direct or indirect impact on their revenues.
Many companies have had data processing systems in place for some time, but these structures were designed to solve a specific situation and not to meet future needs, such as data exploitation.
It is essential to build elastic and scalable technological infrastructures that perform well in current environments and whose performance under different conditions can be predicted. Public cloud environments facilitate the design of modernized, self-managed infrastructures suited to the evolution of data platforms, which can then ingest, process and exploit information with greater capabilities. This eventually leads the company to become a data-driven organization, that is, a company that can make decisions directly from the data it generates.
Structuralia already had a data platform and data ingestion and transformation systems in AWS. But these processes and the platform itself did not have the reliability, scalability and growth capacity that Structuralia needed to provide advanced analytics and BI services to its customers.
To address these shortcomings, Keepler began modernizing Structuralia's platform, designing and building a data platform in AWS that connects with relational databases, APIs and third-party sources. The platform allows the orchestration of workflows for data ingestion and transformation, with the possibility of ingesting data in real time (RT), near real time (NRT) and batch format. To achieve this, three milestones were defined in order to meet the client’s requirements:
- The architecture should allow implementing the first use case and scaling with new data sources and analytical needs in an agile and flexible way.
- Relational data ingestion must be done by building connectors and orchestrating the data flow with the selected data sources.
- Data must be accessible via APIs and exploitable both directly through a BI tool and indirectly.
In order to achieve the milestones set for this project, the following main objectives were established:
- Securely ingest data through the AWS Database Migration Service (DMS). The loads performed through this process will be full loads, run daily.
- Create a Data Lake that allows the stored data to be managed centrally and processed in such a way that it provides value to the subsequent stages.
- Provide an extra security layer to access the data stored in the Data Lake.
- Generate reports and replicate existing dashboards in QuickSight, based on the information contained in the Data Lake.
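The first objective, a full load through DMS, can be illustrated with a minimal Python sketch. It only builds the request for a full-load replication task and does not call AWS. The function names, the schema name and the ARNs are hypothetical placeholders, not details of Structuralia's setup. The daily schedule is assumed to be handled by a separate scheduler that is not shown here.

```python
import json


def full_load_table_mappings(schema: str) -> str:
    """Return DMS table-mapping JSON that selects every table in `schema`."""
    rules = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-all-tables",
                # "%" is the DMS wildcard, so all tables in the schema match.
                "object-locator": {"schema-name": schema, "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }
    return json.dumps(rules)


def full_load_task_request(task_id, source_arn, target_arn, instance_arn, schema):
    """Build keyword arguments for the DMS `create_replication_task` call.

    All argument values are supplied by the caller; nothing here is taken
    from the project described in this case study.
    """
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # "full-load" copies the existing data without ongoing replication.
        "MigrationType": "full-load",
        "TableMappings": full_load_table_mappings(schema),
    }
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("dms").create_replication_task(**request)`.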
The technology used to develop this project was Amazon Web Services (AWS), which comprises a series of cloud computing tools and services. Among them, the following services were used:
Amazon Aurora: relational database management system (RDBMS) built for the cloud and compatible with MySQL and PostgreSQL.
Database Migration Service (DMS): service for migrating databases to AWS quickly and securely.
Lake Formation: service that allows the management of Data Lakes.
S3: object storage service where data is hosted.
Glue: metadata repository (Glue Data Catalog).
Athena: interactive query service that facilitates the analysis of data in Amazon S3 and other federated data sources using standard SQL.
QuickSight: cloud-scale business intelligence (BI) service used to deliver easy-to-understand information to the people you work with, wherever they are.
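To show how Athena can sit between the data stored in S3 and the reporting layer, the following Python sketch builds the request for an Athena query without calling AWS. The database, table and column names and the S3 output location are hypothetical and chosen only for illustration. They do not describe Structuralia's actual data model.

```python
import re


def athena_query_request(database: str, table: str, output_location: str) -> dict:
    """Build keyword arguments for the Athena `start_query_execution` call.

    The query counts rows per country in a hypothetical table; `country`
    is an assumed column name.
    """
    # Accept only simple identifiers so the table name cannot inject SQL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"invalid table name: {table!r}")

    query = (
        f'SELECT country, COUNT(*) AS students FROM "{table}" '
        "GROUP BY country ORDER BY students DESC"
    )
    return {
        "QueryString": query,
        # The database is the one registered in the Glue Data Catalog.
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("athena").start_query_execution(**request)`. A BI tool such as QuickSight would normally connect to Athena directly rather than through code like this.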
Keepler is a boutique professional technology services company specialized in the design, construction, deployment and operation of Big Data and Machine Learning software solutions for large clients. It uses Agile and DevOps methodologies and native public cloud services to build sophisticated business applications focused on data and integrated with different sources in batch mode and real time. It holds Advanced Consulting Partner level, and 90% of the professionals in its technical workforce are certified in AWS. Keepler currently works for large clients in different markets, such as financial services, industry, energy, telecommunications and media.