Business Intelligence or Analytics
Data is the new gold, and is a key part in driving businesses now. Every organization wants to get more and more data these days so it could make informed decisions and increase customer satisfaction and happiness.
Data could be coming from a lot of sources. If we take an example of a super market, various data sources are card swiping machines, inventory systems, products, supplier systems, employee details, number of stores etc etc.
If a super market wants to launch a new product into their chain they would like to know how their customers might react, when to launch the product, which stores to launch into their product first. These crucial decisions could make or break the product launch. You wonder how? Imagine launching a new ice cream brand in the peak of winter and expecting it to sell out! There’s a lot of analytics and thought process before any product comes to the market for example.
And ETL developers play a crucial role in this whole system. ETL aka Extract Transform Load developers are responsible for integrating data from different systems, and applying business rules on top and preparing them for business consumption. They need to be the masters in understanding data sets, understanding the commonality of datasets, relations between datasets and how to make them talk to each other.
An example data integration workflow in SQL Server Integration Services tool (SSIS) might look like this
What does it take to be ETL developer
- Discussions with vendors : Around access to the source systems (could be a dB like Oracle, SQL Server, SharePoint, Azure Blob, Amazon S3 etc). Some of the source systems might be internal to the organization so might need Active Directory kind of authentication. External systems mostly would have a SQL Username/Password authentication model though.
- Data landing : Once access has been sorted, data needs to be landed into the warehouse. It might involve using tools like SSIS, Datastage, ODI, Powershell scripts or any other scripts for that matter
- Analysis : Understanding data, it’s nature, source database normalisation (1NF, 2NF, Boyce Codd), relations between various datasets, volumes, any date fields to identify today’s data
- Designing : Once data is understood, next is to design it to serve business needs. It involves drawing up a star schema or a snowflake models. Most organizations prefer star, and hence identifying measures, attributes, dimensions, role playing dimensions need for tracking any attributes. Once the designs been done, it’s always good to take the analysts or wider project sponsors, and the reporting through the same to get feedback and see it aligns with their needs as well.
- Development : Next step is to convert the design to a physical model and start applying business transformations etc on top of data.
- Analysts Analysis : Once data is ready, its time to give it to Business analysts so they could play around with it and identify any gaps, issues, change in rules, get a feel of it. It also is a good time to see if the data sourced is fit enough and see if any more data needs to be brought in.
- Presentation Layer : A large part of it might have been achieved already during development, however there might be a need to customize the models to various teams. Example might be creating multiple customer dimensions types like customer leads for sales team, onboarded customers for accounts team.
- Reporting team: After presentation layer design, the models are given access to reporting team so they could start building visuals etc.
- Documentation : This is a crucial step as to write down all aspects of this model for future support, and maintenance.
Some common alias : SSIS Developer, Datastage Developer, Integration Specialist etc