Use case #BIGDATA
by near-real time monitoring
CAF
CAF is a multinational group with more than 100 years of experience offering integral transport systems at the forefront of technology, adding high value in sustainable mobility. A reference in the railway sector, it offers its clients one of the widest and most flexible ranges on the market in rolling stock, components, infrastructure, signalling and services (maintenance, rehabilitation and financial services).
Within this framework, several years ago CAF launched an initiative called “Tren Digital”, which led to the creation of the LeadMind platform.
LeadMind provides a new generation of connected trains and more competitive services for railway operators and maintainers. It does so through the collection, storage, processing and advanced analysis of data, to support real-time decision making and move towards condition-based and predictive maintenance.
The transport sector has adopted Industry 4.0 standards
Industry 4.0 standards, characterised by intelligent systems and Internet-based industrial solutions, have been adopted by the transport sector, especially the railway sector. The use of new technologies is leading to improvements in the quality of services and in business models, based on big data analytics and its potential to transform current platforms into a network of collaborative communities that move the transport of goods and passengers. The current trend in automation and data exchange is towards the adoption of new and emerging technologies to achieve greater levels of efficiency and effectiveness.
The solution and main AWS services used
CAF relies on Keepler Data Tech for the integration of AWS technology, pursuing two main technical objectives:
- Implement the LeadMind Analytics solution in a cloud architecture to ingest, process and store train data, in batch and, where needed, in real time.
- Generate reports and dashboards with KPIs identified and categorized by CAF.
To avoid over-engineering, and supported by agile methodologies, Keepler’s approach is to build minimum viable products using AWS Big Data processing and analytics services. Each one has enough scope to validate the technologies and approaches used to solve a specific problem, and makes it simple to measure their effectiveness.
The result is a comprehensive solution that receives data from trains and processes it so that it is properly stored in a Data Lake. The solution can ingest data from a limited set of vehicles equipped with diagnostic units (sDiag) and scale to any number of vehicles in the future.
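As an illustration of what storing train data "properly" in a Data Lake can look like, the sketch below builds a partitioned object key for a file received from a vehicle. The key layout (`vehicle_id=`, `year=`, `month=`, `day=`), the `raw` prefix and the function name `data_lake_key` are assumptions made for this example; the case study does not describe the layout actually used in LeadMind.

```python
from datetime import datetime, timezone


def data_lake_key(vehicle_id: str, timestamp: datetime, filename: str,
                  prefix: str = "raw") -> str:
    """Build a Hive-style partitioned object key (hypothetical layout)."""
    # Normalise to UTC so the same reading always lands in the same partition.
    ts = timestamp.astimezone(timezone.utc)
    return (
        f"{prefix}/vehicle_id={vehicle_id}"
        f"/year={ts:%Y}/month={ts:%m}/day={ts:%d}/{filename}"
    )


if __name__ == "__main__":
    when = datetime(2019, 3, 5, 12, 0, tzinfo=timezone.utc)
    print(data_lake_key("T001", when, "diag.csv"))
```

With a layout of this kind, adding vehicles only adds new key prefixes, and a SQL engine whose table is partitioned on the same fields can restrict a query to the vehicles and dates it needs.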
The solution is based on managed services, which makes for a serverless implementation that is easy to maintain, robust, secure and scalable. The main Amazon Web Services offerings used are:
- Amazon S3 as the main storage repository.
- Amazon Athena to query the Data Lake using SQL.
- AWS Glue as an ETL tool and Data Catalog.
- Amazon EC2 for BI services with TIBCO Spotfire.
- Amazon S3 Glacier as backup for old files.
- Amazon SageMaker to launch iPython Notebooks, used by CAF data scientists to develop new models.
- Amazon Redshift, automatically loaded with a subset of the processed source data to optimize Business Intelligence processes.
- Amazon DynamoDB as metadata storage.
- Amazon RDS (with MySQL) as master data storage for field transformations.
- AWS Batch for FTP synchronization.
- AWS Lambda to execute detection logic in the ETL and raise near-real-time alarms.
- Amazon SNS and Amazon SES to handle errors and near-real-time notifications.
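To make the alarm path in the list above more concrete, here is a minimal sketch of a Lambda-style handler that checks readings against thresholds and hands each alarm to a notification function. The event shape, the signal names, the threshold values and the `publish` parameter are all hypothetical and chosen only for illustration; the case study does not describe CAF's actual detection logic.

```python
import json
from typing import Callable, Dict, Iterable, List

# Hypothetical signal names and limits, for illustration only.
THRESHOLDS: Dict[str, float] = {"brake_temp_c": 120.0, "door_cycle_ms": 4000.0}


def detect_alarms(readings: Iterable[dict],
                  thresholds: Dict[str, float] = THRESHOLDS) -> List[dict]:
    """Return one alarm record for each reading above its threshold."""
    alarms = []
    for reading in readings:
        limit = thresholds.get(reading["signal"])
        # Signals without a configured threshold are ignored.
        if limit is not None and reading["value"] > limit:
            alarms.append({
                "vehicle_id": reading["vehicle_id"],
                "signal": reading["signal"],
                "value": reading["value"],
                "limit": limit,
            })
    return alarms


def handler(event: dict, context=None,
            publish: Callable[[str], None] = print) -> dict:
    """Lambda-style entry point; `publish` is injected so it can be tested."""
    alarms = detect_alarms(event.get("readings", []))
    for alarm in alarms:
        publish(json.dumps(alarm))
    return {"alarm_count": len(alarms)}
```

In a deployment, `publish` could wrap a call to the SNS `publish` operation (for example through boto3, passing a topic ARN and the message) so that each alarm becomes a notification. Keeping it as a parameter lets the detection logic be tested without any AWS access.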
Benefits
- The pay-per-use model of the public cloud has given CAF a solution that considerably reduces investment costs.
- Because the solution is implemented entirely with managed services, the operating cost is reduced.
- All parts of the solution scale horizontally, so integrating more sensors or growing the train fleet does not create a bottleneck, and scaling is agile and automatic.
- It is an open system that allows the integration of any tool that can be deployed on AWS to exploit the information.
- The storage and processing cost is significantly lower. For example, processing all the historical data takes more than 90% less time than with the previous on-premises solutions.
Keepler is a boutique professional technology services company specialized in the design, construction, deployment and operation of Big Data and Machine Learning software solutions for large clients. They use Agile and DevOps methodologies and native public cloud services to build sophisticated, data-focused business applications integrated with different sources in batch mode and in real time. They hold Advanced Consulting Partner level, and 90% of their technical workforce is AWS certified. Keepler currently works for large clients in different markets, such as financial services, industry, energy, telecommunications and media.
If you want to know more, or want us to develop a proposal for your specific use case, contact us and we’ll talk.