Imagine walking into a new data analyst job in a new organisation. This company has more than 10 core systems, 3 different major data warehouses, a semi-transparent data lake, and a seemingly unlimited supply of reports and dashboards, all with outdated, incomplete documentation.
This is your first job as a data analyst. Your responsibility is to find the proverbial needle in the haystack, that “actionable insight”. Or at least, that’s what you thought.
To me, data analysts are the unsung heroes of data teams within an organisation. Overshadowed by the data scientists, and easily confused with business analysts, they are the glue that keeps a data team moving, and yet, they are often misunderstood.
The rapid growth of data over the past two decades has brought with it a large amount of change. Two decades ago, the concept of a dashboard was new, “ELT” was a typo and data scientists as we know them today did not exist.
There were no Data Science courses. I pushed myself to attend Statistics courses and learned text analytics from analysing DNA strains. I was a Computer Science and Marketing major. I had no interest in Biology and constantly questioned my life choices throughout the course.
With that change, the role of a data analyst has changed significantly, and so too have the problems they face.
What’s in a Data Analyst?
Data analysts work with data to help their organizations make better business decisions. Using techniques from a range of disciplines, including computer programming, mathematics, and statistics, data analysts draw conclusions from data to describe, predict, and improve business performance. They form the core of any analytics team and tend to be generalists versed in the methods of mathematical and statistical analysis.¹
Data Analysts are definitely generalists, because what you’ll find is that the role varies depending on who is in your data team and how they operate.
For example, if you have data engineers in your team, in most cases you would not have to worry about collecting and cleaning the data. If you work with business analysts, there is less onus on you to define the business requirements. This works in reverse too: Data Analysts naturally adapt their skills to fill whatever gaps exist in a data team.
This is reflected in job descriptions, where Data Analysts are commonly expected to have skills in areas that are usually attributed to other roles. Here’s an example of two different Data Analyst positions:
While it's a strength that Data Analysts are seen as generalists, this can be difficult if you are being asked to do something that isn’t in your skillset.
This is just one obstacle that Data Analysts have to deal with. What other obstacles do new Data Analysts face, and what can you do about them?
Obstacle #1: Too many systems and no documentation
There can be hundreds of different systems operating in a business, with no documentation to be found.
Not only that, but organisations seem to love acronyms, and unfortunately half of them are probably related to the systems that you are supposed to be an expert in.
Tip #1: Get the lay of the land
In scenarios where you have no documentation, your best and most knowledgeable resources are your new colleagues.
To help learn quickly, start writing down the different systems (or acronyms) you hear about and how you think they are all related. Then get a colleague to review it for you.
A lot of new hires are constantly worried about how they are perceived by their new team when they start, too afraid to ask “stupid” questions, but if there’s ever a time to ask a question, it's in your first few weeks on the job.
Obstacle #2: Unclear requests
“I want to know what are the most popular types of products that customers are buying, the frequency, and the channels through which customers are making those purchases.”
That’s a fair request. But further clarification is always useful, especially if you are new, have little context, and are unsure of what the result should look like.
Tip #2: Ask the right questions of the right people
The key to dealing with unclear requests is knowing what questions to ask to clarify the problem, and whom to ask. What you want to avoid is the constant back and forth that is associated with unclear requests, or worse, the possibility of undertaking the work only to be told your results are wrong, with no indication as to why.
To the requestor:
- Define “most popular”. Is that by quantity, amount, or something else?
- What is the date period?
- How would you like your results presented?
- Do you have any previous reports that I can check my numbers against?
To the data team:
- What’s the most reliable data source for this area?
- Has any other work been done on this that I can look at?
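To see why the “most popular” question matters, here is a minimal sketch in Python. The sales rows, product names and prices are made up purely for illustration, and `rank_products` is a hypothetical helper, not something from a real system. It ranks the same data by quantity and by amount:

```python
from collections import defaultdict

# Made-up toy sales rows: (product_type, channel, quantity, unit_price).
SALES = [
    ("socks", "online", 10, 2.0),
    ("socks", "store", 5, 2.0),
    ("jackets", "online", 2, 80.0),
    ("jackets", "store", 1, 80.0),
    ("hats", "store", 6, 10.0),
]


def rank_products(rows, metric):
    """Rank product types by total 'quantity' or total 'amount' (quantity * unit price)."""
    if metric not in ("quantity", "amount"):
        raise ValueError(f"unknown metric: {metric}")
    totals = defaultdict(float)
    for product, _channel, quantity, unit_price in rows:
        totals[product] += quantity if metric == "quantity" else quantity * unit_price
    # Highest total first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


by_quantity = rank_products(SALES, "quantity")
by_amount = rank_products(SALES, "amount")
print("Top by quantity:", by_quantity[0][0])
print("Top by amount:", by_amount[0][0])
```

With this toy data, socks come out on top by quantity while jackets come out on top by amount. Both are a defensible reading of “most popular”, which is exactly why it pays to ask before you start.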
Obstacle #3: Constant ad-hoc requests
The number of ad-hoc requests can feel like an onslaught.
A never-ending onslaught.
Over time you will start to see a pattern in the type of requests you receive. Before long you’ll get a request similar to one you’ve already answered, only with a different date period or with certain conditions filtered out.
Tip #3: Save your common queries and code (with comments!)
Organise and save your common queries and code. Try to get into the habit of including parameters in your code, like dates, so that they are quick to change later.
As with all programming, commenting your code is important, but arguably even more so when working with data. Perhaps there is some quirk with the data, and you had to take that into account in your code. Seems reasonable, right? Sure! But when you review the same uncommented code months down the line, it may no longer make sense. So don’t forget to comment! You’ll thank yourself later.
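As a rough sketch of what a saved, parameterised and commented query could look like, here is an example using Python's built-in `sqlite3` module. The `orders` table, its columns, the sample rows and the “test channel” quirk are all assumptions invented for this illustration; your own warehouse and its quirks will differ.

```python
import sqlite3

# Parameters live at the top so a repeat request only needs these edited.
START_DATE = "2023-01-01"  # inclusive
END_DATE = "2023-04-01"    # exclusive
EXCLUDED_CHANNEL = "test"  # assumed quirk: internal test orders share the table

TOP_PRODUCTS_SQL = """
SELECT product_type, SUM(quantity) AS total_quantity
FROM orders
WHERE order_date >= :start_date
  AND order_date < :end_date
  AND channel <> :excluded_channel  -- drop internal test orders (assumed quirk)
GROUP BY product_type
ORDER BY total_quantity DESC, product_type
"""


def top_products(conn, start_date, end_date, excluded_channel):
    """Return (product_type, total_quantity) rows for the given date window."""
    params = {
        "start_date": start_date,
        "end_date": end_date,
        "excluded_channel": excluded_channel,
    }
    return conn.execute(TOP_PRODUCTS_SQL, params).fetchall()


# Demo on an in-memory database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_date TEXT, product_type TEXT, channel TEXT, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        ("2023-01-15", "socks", "online", 10),
        ("2023-02-03", "jackets", "store", 2),
        ("2023-02-10", "socks", "test", 99),   # excluded by the channel filter
        ("2023-03-20", "jackets", "online", 3),
        ("2023-05-01", "hats", "store", 6),    # outside the date window
    ],
)
result = top_products(conn, START_DATE, END_DATE, EXCLUDED_CHANNEL)
print(result)
```

When the same request comes back for a different quarter, only the two date constants need to change, and the comment on the channel filter explains a line that would otherwise look arbitrary six months later.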
Tip #3b (Bonus!): Save the request and the output
You are going to get the same request again at some point. So while it's good that you are now saving your queries and code, you might have forgotten what the actual request was, and the associated output.
To save you hours trawling your email or Jira for work that you completed 6 months ago, get into the habit of saving the request and the output.
Tip #3c (Bonus!): Use this information for your next performance review
It's easy to remember big projects when it comes to your performance review. But those requests start to add up, and over time you will forget what you’ve done. Partitioning the requests and outputs by year and month will make your next performance review a cakewalk.
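One possible way to save each request with its output, partitioned by year and month, is sketched below using only the Python standard library. The folder layout, the file names `request.txt` and `output.csv`, and the helper names `archive_request` and `requests_in_year` are all assumptions made for this example; use whatever structure suits your team.

```python
import csv
import tempfile
from datetime import date
from pathlib import Path


def archive_request(root, received, slug, request_text, header, rows):
    """Save the request wording and its output under root/YYYY/MM/<date>_<slug>/."""
    folder = (
        Path(root)
        / f"{received:%Y}"
        / f"{received:%m}"
        / f"{received.isoformat()}_{slug}"
    )
    folder.mkdir(parents=True, exist_ok=True)
    (folder / "request.txt").write_text(request_text, encoding="utf-8")
    with open(folder / "output.csv", "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return folder


def requests_in_year(root, year):
    """List the archived request folders for one year, e.g. before a review."""
    year_dir = Path(root) / str(year)
    return sorted(p.name for p in year_dir.glob("*/*") if p.is_dir())


# Demo in a temporary directory with a made-up request and output.
archive_root = tempfile.mkdtemp()
saved = archive_request(
    archive_root,
    date(2023, 3, 7),
    "top-products",
    "Most popular product types by quantity for Q1.",
    ["product_type", "total_quantity"],
    [("socks", 10), ("jackets", 5)],
)
print(saved)
print(requests_in_year(archive_root, 2023))
```

Come review time, listing a year's folders gives you a ready-made record of every request you handled, alongside the exact wording and the numbers you delivered.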
References
[1] T. Olavsrud, What is a data analyst? A key role for data-driven business decisions (2020), CIO.com