I love the way that tools like Power BI have made creating analytics approachable for a much wider range of users. The skills acquired when doing this are a great start for using analytics in enterprises, but an area that I still see missing is data modelling.
What I see users doing is this:
- Decide what the output reports, etc. should look like
- Design tables that match the reports.
And this is where things go wrong.
Reports aren't tables
Take an example where I have customers who have accounts. A report on customers seems to match pretty well with a table i.e. it has some sort of identifier for the customer, their name, their address, etc.
But then the user decides that they need an accounts table. They look at an accounts report, and see that it has things in it like the name of the customer, or their address. And so they model an accounts table with attributes of the customer, as well as attributes of the account.
The same thing happens with the customer. The customers report might have had some items that are account-related. So those attributes get added to the customers table as well.
You can imagine how this evolves. It's well-intentioned; it lets them get data quickly; but for an enterprise? It's the start of an inglorious mess.
If you're starting to create tables for enterprise analytics, you really need to learn about data modelling. Don't assume that tables are just copies of what's needed in a report.