What were the Drivers?
- Consolidated view of data from industry leaders
- Lack of cutting-edge analytics on industry data
- Strategic need to establish the organization as the authoritative data provider in the industry
- No ability to slice and dice complex risk management KPIs across various dimensions
- Need to make BI self-service across the organization
- Need to create a client-centric reporting framework
Engagements
One of the top eCommerce API and end-to-end eCommerce software services providers in the US was working to improve its sales. It faced both internal and external pressure to increase overall sales.
Cloud Data Warehouse (AWS) and Data Analytics
The Solution
Approach & Solution
- Migrated data from multiple sources and formats (Salesforce, JSON, XML, and semi-structured data from Java APIs) using the Talend ETL tool, and loaded it into a data warehouse hosted on AWS
- Used Spark with Python for high-performance data loading
- Used the KMP (Knuth-Morris-Pratt) algorithm and POSIX regular-expression operators to find and match patterns when standardizing data
- Pushed data in real time from the sources to the target server to feed Power BI visualizations
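The pattern-matching step can be illustrated with a minimal Python sketch of KMP substring search, followed by a small regular-expression cleanup rule. This is not the project's actual code: the function names, the sample strings, and the phone-number rule are hypothetical, and Python's `re` module stands in here for the POSIX operators that would run inside the warehouse.

```python
import re
from typing import List


def build_failure_table(pattern: str) -> List[int]:
    """For each prefix of `pattern`, return the length of its longest
    proper prefix that is also a suffix (the KMP failure table)."""
    table = [0] * len(pattern)
    k = 0  # length of the prefix matched so far
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]  # fall back to a shorter matching prefix
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_find_all(text: str, pattern: str) -> List[int]:
    """Return the start index of every occurrence of `pattern` in `text`,
    scanning the text once without re-reading matched characters."""
    if not pattern:
        return []
    table = build_failure_table(pattern)
    matches = []
    k = 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = table[k - 1]  # keep going so overlapping matches are found
    return matches


def standardize_phone(raw: str) -> str:
    """Hypothetical standardization rule: strip non-digits and format
    10-digit numbers as XXX-XXX-XXXX; leave anything else unchanged."""
    digits = re.sub(r"[^0-9]", "", raw)
    if len(digits) == 10:
        return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"
    return raw
```

For example, `kmp_find_all("abababa", "aba")` returns `[0, 2, 4]`, and `standardize_phone("(555) 123-4567")` returns `"555-123-4567"`.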
Benefits & Outcome
Tools
- Talend Studio
- Java code for API calls
- Python code for Amazon Redshift UDFs (analytics)
- Spark with Python (for data loading)
- Amazon Redshift and S3
- Power BI