D365 ETL 101: Things to Avoid When Extract, Transform, and Load Your Data3 min read

ETL in Dynamics 365 (also known as extract, transform, and load) is a complex process of data extraction from multiple sources throughout the D365 system. Even though there are a lot of services & tools have been developed to make the process more efficient, organizations oftentimes face challenges.

Since data comes from multiple sources, formats may vary as well – which leads to the need to move data to one or more data stores. Furthermore, the destination might not be the same type of data store as the source and needs additional cleaning and/or shaping before loading into its final destination.

To ensure your ETL is error-free and runs smoothly, here is some 101 guidance on how to avoid rookie mistakes when you extract, transform, and load your data.

Extraction

One of the most common scenarios during extraction is when a team wants to extract data in bulk, not testing beforehand. While this oftentimes leads to a mess of data, we recommend challenging extraction with one piece of data, moving it through the whole process, testing it, and spotting any issues the team may run into.

You can’t predict all of the problems, especially when doing it the very first time, so the extra step of testing won’t hurt. Extraction may look easy at a glance, but trust us – it is not. Usually, you have two teams – one knows the source, and another knows the target – building the bridge between teams is the main goal for Proof of Concept.

Needless to say, there scenarios when a team wants to extract a specific number of tables but have poor knowledge of either source or target systems (or both). The first question we ask in such cases is, ‘what are your plans for the data’?

You need to understand what’s behind the scene in order to pick the right format that your target system will process, punctuation marks and etc. ALWAYS start with data cleaning to avoid tabulation or else.

Monitoring

In order to keep an eye on issues that may occur, like ones during extraction, it’s highly recommended to control the number of recordings in the source system and how many actually got into your transformation tool. If you see that there are, let’s say, 100 records in your CRM, but Excel has only got 82 – that means you’ve got a hammer blow.

Essentially, you are losing some pieces of your data, causing the mess.

Transformation Tools

Another thing to keep in mind is the transformation tools you are utilizing during the ETL process – always keep in mind which tool will be handy for your extraction, transformation, and load. There are some cases when, for example, Excel maybe not be the best choice since there is a limit of data it can seamlessly process. If you break that limit, Excel simply won’t work.

Another issue with Excel is that it is a manual data transformation tool (i.e., not automated), and there is a risk of getting human errors. In case you are working with a huge amount of data, you know better to use specialized transformation tools in order to complete your ETL process.

Make sure to determine who’s responsible for picking the tool – the source or the load team.

How to ensure you are on the right path?

To ensure the whole process runs smoothly, we recommend you run a test extraction before dealing with your overall scope. Both teams, the source and load ones must run at least one extraction through the ETL process and eliminate any issues if any occur. Once that’s done, only they do start working on the whole scope.

Your ‘proof of concept’ extraction should have just enough data to test the waters – just huge enough to see the whole picture. Keep Balance!