ETL…WTH is that? We all love acronyms, right? Especially when they mean absolutely nothing to us. With that in mind, let’s roll this out a little. Data management can be seriously complex, requiring some really smart people to first find the data and make it available (Extract), then make it usable in a format that makes sense to the end stakeholder (Transform), and lastly make it accessible in the interface or location the business needs (Load).
This can be difficult to accomplish if you either don’t know what you are doing or don’t have the technical team required to make it happen. That’s where an ETL tool comes in. These systems are preconfigured to connect to many data sources (databases, data warehouses, data lakes) to Extract, then Transform, and Load the data to an endpoint such as a Business Intelligence tool (Power BI, Looker, Tableau, etc.).
This reduces the need for in-depth technical knowledge or expensive development efforts.
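To make those three steps a little more concrete, here is a minimal Python sketch of an ETL run. Everything in it is an assumption for illustration: the orders API endpoint, the field names and the SQLite reporting database are all made up, and a real ETL tool would hide this kind of work behind pre-built connectors.

```python
import sqlite3

import pandas as pd
import requests

# Extract: pull raw records from a (hypothetical) orders API.
response = requests.get("https://api.example.com/v1/orders", timeout=30)
response.raise_for_status()
orders = pd.DataFrame(response.json())

# Transform: keep the columns stakeholders care about and tidy the types.
orders["order_date"] = pd.to_datetime(orders["order_date"])
orders["revenue"] = orders["quantity"] * orders["unit_price"]
report = orders[["order_id", "order_date", "revenue"]]

# Load: write the result somewhere the business (or a BI tool) can reach it.
with sqlite3.connect("reporting.db") as conn:
    report.to_sql("daily_orders", conn, if_exists="replace", index=False)
```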
ETLs give you access to various data sets through APIs (Application Programming Interfaces), which most modern systems expose.
If you have not come across ETL tools before, here is a short list of the premium and open-source options you could look at.
Premium ETLs (usually cloud hosted, at a subscription price):
- DataDDo
- DataSlayer
- Power My Analytics
- Stitch Data
- Hevo
- Coupler.io
Open-source, self-hosted ETLs (no cost if you can set them up in your own server environment):
- Mage.ai
- Airbyte
- Apache Airflow
- Singer
Each has its pros and cons, but the two things you really need to weigh are cost and internal resources. The premium ETLs will have a monthly or yearly subscription you will need to pay. The open-source ones will cost you internal resources, such as server space and usage, and will potentially require a technical person to support the setup. If you are new to this, the best option is to start with a premium subscription until you actually know which data combinations are valuable to you.
So that covers the Extract and Transform aspects of data management. With Load, though, you will generally want to make that data available to the business unit owner. This means visualising the data to make it useful.
Visualisers are systems that turn raw data into visually usable output. Common options include Power BI, Looker Studio and Tableau.
Again, depending on where you are in your data journey and the complexity of your data requirements, you can start with a premium or an open-source option. For entry-level visualisation using common datasets and/or ETL-connected data, try Looker Studio. It’s free and reasonably simple to use. If your data gets more complex and you need to write custom queries using SQL or Python, you will need to look at one of the premium options or the more advanced open-source tools.
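As a rough idea of what “custom queries” look like in practice, here is a small sketch that runs SQL against the reporting database from the earlier ETL example; the table and column names are assumptions carried over from that sketch.

```python
import sqlite3

import pandas as pd

# A hypothetical custom query: monthly revenue from the table loaded earlier.
query = """
    SELECT strftime('%Y-%m', order_date) AS month,
           SUM(revenue) AS total_revenue
    FROM daily_orders
    GROUP BY month
    ORDER BY month
"""

with sqlite3.connect("reporting.db") as conn:
    monthly = pd.read_sql_query(query, conn)

print(monthly)
```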
Premium Options:
- Tableau
- Metabase (cloud hosted)
- Power BI Pro
- Whatagraph
- Databox
There are many!
Open-source / free (self-service):
- Looker Studio
- Power BI (Downloadable)
- Metabase self-hosted
- Tableau Public
- PyGWalker
In either case there will be pros and cons to each of these systems; it comes down to cost versus the complexity of self-managing an option. Either way, these tools will be able to connect your website data (GA4) with your CRM data (ecommerce and user data). Once this is done you can begin to visualise that data in one location for greater, more accurate insight.
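As a rough sketch of what that connection might look like before it reaches a visualiser, here is a hypothetical pandas join between a GA4 purchase export and a CRM order export. The file names and column names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical exports: GA4 purchase events and CRM orders, both keyed on a
# transaction/order ID. File names and columns are assumptions for illustration.
ga4 = pd.read_csv("ga4_purchases.csv")  # e.g. transaction_id, source, medium
crm = pd.read_csv("crm_orders.csv")     # e.g. order_id, customer_id, order_value

# Join website behaviour to CRM revenue so both live in one table.
combined = ga4.merge(
    crm,
    left_on="transaction_id",
    right_on="order_id",
    how="inner",
)

# This combined table is what you would point a visualiser at.
combined.to_csv("combined_orders.csv", index=False)
```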
Some notes: although we make this sound simple, keep in mind that once the data is connected you may encounter some additional complexities, such as data type mismatches or data structure mismatches. A common example is an OrderID field: in GA4 it may be a numeric value, but in the CRM it is a string. This mismatch will make the data incompatible, and you will need to do some additional work converting one or the other data type so they work together. Another common error we come across is in the data structure. Some systems will use DD-MM-YYYY dates whereas others may use YYYY-MM-DD. These two are not directly compatible, so you may need to amend them.
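Here is a small, hypothetical example of cleaning up both kinds of mismatch with pandas; the column names and values are made up to mirror the OrderID and date examples above.

```python
import pandas as pd

# Hypothetical rows standing in for the two sources described above.
ga4 = pd.DataFrame({
    "OrderID": [1001, 1002],               # numeric in GA4
    "date": ["03-01-2024", "04-01-2024"],  # DD-MM-YYYY
})
crm = pd.DataFrame({
    "OrderID": ["1001", "1002"],           # string in the CRM
    "date": ["2024-01-03", "2024-01-04"],  # YYYY-MM-DD
})

# Fix the type mismatch: cast both OrderID columns to the same type.
ga4["OrderID"] = ga4["OrderID"].astype(str)
crm["OrderID"] = crm["OrderID"].astype(str)

# Fix the structure mismatch: parse both date columns into real datetimes.
ga4["date"] = pd.to_datetime(ga4["date"], format="%d-%m-%Y")
crm["date"] = pd.to_datetime(crm["date"], format="%Y-%m-%d")

# Now the two sources join cleanly.
combined = ga4.merge(crm, on="OrderID", suffixes=("_ga4", "_crm"))
```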
Thanks for reading the series! With any luck it was helpful. All the best in your data journey until part 3!