There’s no escaping it: data analytics is one of the hottest career fields of the 21st century! But what exactly is data analytics, and what does a data analyst actually do?
There’s no end of discussion and commentary about data analytics online. However, it’s not always easy to find a no-frills description of what a data analyst does on a day-to-day basis. This is made even harder by the fact that data analytics is often lumped in with related fields like data science, machine learning, artificial intelligence, and business analytics. While data analytics plays a key role in all these fields, it is a distinct discipline in its own right.
In this article, we offer a clear, career-focused introduction to data analytics. We’ll cover all the need-to-know knowledge without the fuss, answering:
- What is data analytics?
- What does a data analyst do?
- What tasks and processes does a data analyst follow?
- What skills does a data analyst need?
- What tools do data analysts use?
- Wrap-up and further reading
So, what does a data analyst do? Let’s find out.
1. What is data analytics?
Before diving into what a data analyst does, it’s necessary to answer: what is data analytics? And why is it important?
In its simplest form, data analytics is the process of drawing meaning from disordered information. By systematically exploring data for patterns and relationships, data analysts seek to find and communicate useful insights. But what counts as data? Well, pretty much anything you can imagine. Often, data are numerical (quantitative data). But sounds, images, words, or anything else that can be interpreted in some way can also be classed as data (qualitative data).
An analyst’s job begins with what’s known as ‘raw data.’ Raw data are disordered and—without context—essentially meaningless. We can only obtain useful information from them once we have brought order to chaos. As such, collecting, cleaning, and organizing data are all parts of the data analytics process.
What’s more, effective data analytics incorporates many techniques to help the process along. These include statistics, programming, visualization, and more. Luckily, to streamline the process, many of these techniques have been automated. Some are even developing as fields in their own right. However, a good data analyst will have at least some knowledge of them all.
Want to learn more about what data analytics is and what a data analyst does? We cover this topic (and more) in our free introductory short course for beginners. Check out tutorial one: An introduction to data analytics.
Why does data analytics matter?
There are two simple reasons why data analytics matters. Firstly, it’s useful for decision-making. Secondly, it’s evidence-based. Combine these two attributes, and data analytics becomes a potent tool. Basing decisions on empirical information (rather than relying on opinion or ‘gut feel’) is a much more scientific way of approaching problems. While this does not mean data analytics is always 100% accurate, it’s by far the best tool we have for predicting future trends and drawing conclusions about past events.
Data analytics also has a wide range of applications across society. Online, you’ll often find data analytics touted as a tool for business intelligence, e.g. predicting future sales or informing product development and marketing spend.
However, data analytics goes far beyond simply boosting a company’s bottom line. It’s also used in health settings to improve patient care. It’s currently being applied in agriculture to transform the way we feed the world. It’s even used by governments to tackle issues like human trafficking. So if you want to help improve the world—as well as business—a career in data analytics might be for you!
2. What does a data analyst do?
Now we know what data analytics is, let’s take a look at what the role of the data analyst actually entails.
As a data analyst, it’s your responsibility to turn raw data into meaningful insights. Following the data analysis process (which we’ll cover in the next section), you’ll solve specific problems or answer certain questions based on data and the insights it provides.
You’ll then take these insights and share them with key stakeholders and decision makers, who can take action or plan for the future accordingly. At the same time, data analysts may be responsible for overseeing the overall processes for collecting and storing data, as well as setting guidelines for data quality.
A great way to gauge what a data analyst actually does on a day-to-day basis is to look at the tasks and responsibilities that are typically listed in data analyst job descriptions. Based on actual job descriptions posted on indeed.com, here’s what you can expect to do as a data analyst:
- Develop and implement databases and data collection systems
- Work closely with management to identify critical metrics and KPIs, and to prioritize business needs
- Collect data from primary and / or secondary data sources
- Filter and clean data
- Identify, analyze, and interpret trends and patterns in complex data sets
- Visualize and present findings to key stakeholders
- Build and customize reports
- Develop and maintain dashboards
- Create and maintain documentation regarding data models, measures, and infrastructure as they are developed
So far, we’ve taken a rather high-level look at the work of a data analyst. Now let’s zoom in on some of the more specific tasks associated with the data analysis process.
3. What tasks and processes does a data analyst follow?
As a data analyst, your job is to carry out each step of the data analytics process to identify and solve a problem. As your career progresses, you may choose to specialize in a particular area, such as data visualization or data engineering. As a beginner, though, it’s important to learn the process as a whole.
So, what are the key tasks and processes that a data analyst should expect to follow? Although it’s not as straightforward as following one task directly after another (you may find yourself repeating steps, going back on yourself, and so on), the main tasks include:
- Defining a question
- Collecting data
- Data cleaning
- Conducting an analysis
- Communicating your results
Defining a question
First up, you need to define your objective. In some ways, this is the hardest part of the process. This is because what seems like an obvious problem may not always get to the core of an issue.
For example, let’s say you work for a company that wants to boost its revenue. The senior management is set on doing this by launching a suite of new products. As a result, you spend lots of time and resources analyzing what products to create, which market to launch them in, and so on.
However, with a bit more probing upfront, you might discover that there’s nothing wrong with the company’s existing products: it’s simply that the sales process is poor, resulting in low customer satisfaction and less repeat business. With this insight, you might find that investing in sales training will boost revenue at a much lower cost.
While this is just a hypothetical case, it illustrates the importance of probing an issue from multiple angles before investing too much time in it. It also means not being afraid to speak truth to power (in this case, telling managers that their new product idea is wrong). Defining the question you want to answer involves obtaining a deep understanding of the needs and demands of the business, keeping track of metrics, KPIs, and so on. You’ll usually carry out some initial analyses at this stage, too.
Collecting data
Once you’ve identified the question, your next task is to figure out which data are best suited to help you answer it. This can be quantitative data (such as marketing figures) or qualitative data (such as customer reviews). More specifically, data sources can be divided into three categories: first-party data (collected directly by you or your organization), second-party data (the first-party data of another organization), and third-party data (which is aggregated from numerous sources by a third party).
If you don’t already have access to these data, you’ll have to devise a strategy for collecting them. This might include carrying out surveys, social media monitoring, website analytics, online tracking, and so on. However you collect it, once you have the data at your fingertips, you’re ready to clean it.
Data cleaning
Freshly collected data will usually be in a raw format. This means that it hasn’t yet been organized, checked for errors, and so on. To get it into a state that’s suitable for analysis, the data need cleaning. This involves a variety of tools and techniques, such as custom algorithms, generic software, and exploratory analyses.
Data cleaning tasks include removing errors, duplicates, and outliers, eradicating unwanted data (i.e. those that don’t serve your analysis), structuring the data in a more useful way, filling in gaps, and so on. When this is done, you’ll validate the data. This involves checking that it meets your requirements. Often, you’ll find it doesn’t, which means you’ll have to go back a step.
For this reason, data cleaning is considered an iterative process. The combined process of collecting and cleaning data is sometimes referred to as data wrangling. You can learn more about data cleaning in this guide.
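To make the cleaning and validation steps above more concrete, here is a minimal sketch using the pandas library. The dataset, column names, and the choice of the 1.5 × IQR rule for outliers are all assumptions made for illustration; a real project would pick rules that suit its own data.

```python
import pandas as pd

# Hypothetical raw export: every column name and value here is invented.
raw = pd.DataFrame(
    {
        "customer_id": [1, 2, 2, 3, 4, 5],
        "region": ["north", "South ", "South ", "east", None, "north"],
        "monthly_spend": [120.0, 95.0, 95.0, None, 110.0, 10_000.0],
    }
)


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply a few common cleaning steps and return a new DataFrame."""
    # Remove exact duplicate rows.
    out = df.drop_duplicates().copy()

    # Structure the data more consistently: trim whitespace, lower-case labels.
    out["region"] = out["region"].str.strip().str.lower()

    # Fill gaps: label missing regions, use the median for missing spend.
    out["region"] = out["region"].fillna("unknown")
    out["monthly_spend"] = out["monthly_spend"].fillna(out["monthly_spend"].median())

    # Drop outliers using the 1.5 * IQR rule (one convention among several).
    q1 = out["monthly_spend"].quantile(0.25)
    q3 = out["monthly_spend"].quantile(0.75)
    iqr = q3 - q1
    in_range = out["monthly_spend"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out[in_range].reset_index(drop=True)


def validate(df: pd.DataFrame) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if df.isna().any().any():
        problems.append("missing values remain")
    if df["customer_id"].duplicated().any():
        problems.append("duplicate customer ids remain")
    if (df["monthly_spend"] < 0).any():
        problems.append("negative spend found")
    return problems


cleaned = clean(raw)
problems = validate(cleaned)
print(cleaned)
print("Problems:", problems or "none")
```

If `validate` returns anything other than an empty list, that is the signal to go back a step and clean again, which is the iterative loop described above.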
Conducting an analysis
Once your dataset is clean and tidy, you are good to analyze! There are a great many types of data analysis, and part of the challenge is identifying which approach is best-suited to the task at hand. To keep things simple, we’ll offer a quick overview of the four main categories of data analytics.
The first is descriptive analytics. This involves summarizing (or describing) the features of a dataset to better understand it. It isn’t usually used to draw firm conclusions, but it’s a useful first step for deciding how to investigate the data further.
Next, diagnostic analytics focuses on understanding why something has happened (e.g. by exploring correlations between values in a dataset). This helps identify problems and is often used in the first stage of data analytics, i.e. defining the question.
Finally, we have predictive analytics (which helps to identify trends based on past data) and prescriptive analytics (which helps decide on a future course of action). The latter is sometimes carried out using machine learning techniques.
Communicating your results
Once you’ve carried out an analysis and drawn some insights, the final step is to communicate these to those who commissioned them in the first place. This usually involves visualizing your data in some way—creating graphs and charts, for example. You can learn all about data visualization here.
It may also involve creating interactive dashboards, documents, reports, or presentations. It’s easy to overlook the artistry of this step, but it’s very important to get it right. Not only must you interpret your findings correctly, but you need to share them in a way that is clear for time-short, non-technical personnel. This is important as it ensures any decision-making is based on high-quality, well-understood insights.
4. What skills does a data analyst need?
In some ways, the skills a data analyst needs vary depending on their role. For instance, knowledge of the business you’re working in is very important. However, as a rule, this is something you can learn on the job.
Before nabbing that first opportunity, though, there’s a core set of skills that all beginner data analysts need. We can divide these into hard skills (or technical abilities) and soft skills (or useful personality traits that help you get the job done).
Technical skills for data analysts
Hard skills sometimes have a steep learning curve. However, with a little discipline, anyone can pick them up. Key hard skills for data analysts include:
- Math and statistics: You’ll be mathematically minded. You may have an undergraduate or Master’s degree in an area like applied math, statistics, or computing. However, while qualifications can be useful, they’re not always necessary if you’re a newcomer to the field. As long as you have solid math skills, e.g. algebra and calculus, that could be sufficient.
- Programming skills: To create or tweak algorithms that automate data analytics tasks (like parsing or re-structuring large datasets) an element of programming know-how is unavoidable. Scripting languages like Python or MATLAB and statistical computing languages like R and SAS are all popular in data analytics.
- Database knowledge: As well as programming languages, you’ll need some understanding of data warehousing software, e.g. Hive, and analytics engines like Spark. You’ll also need to know database query languages like SQL.
- Excel skills: Commonly used for transforming raw data into a readable format, or for automating complex calculations, MS Excel is core to any data analyst’s toolset. Be sure to familiarize yourself with its key analytical functions.
- Visualization skills: A core aspect of data analytics is the ability to visualize data with charts and graphs. This helps us identify patterns, correlations, and trends. At the very least, you should be able to create plots using Python, or tables and charts using MS Excel.
- Basic machine learning knowledge: As a beginner, nobody will expect you to be an expert in machine learning—it’s an entire discipline in its own right. Nevertheless, the tenets of machine learning underpin many data analytics tasks. You should be familiar with the theory, e.g. supervised learning versus unsupervised learning.
Non-technical skills for data analysts
While soft skills can be honed with practice, they are generally considered more inherent. You’ll need to have a natural flair for the following:
- Communication: Communication is key in any job, but especially in data analytics. Obtaining accurate insights is the priority, but effectively communicating these to wider audiences is vital. You should have excellent interpersonal skills, be able to communicate complex concepts in straightforward terms, and be confident giving presentations and answering questions for non-technical personnel.
- Critical thinking: Arguably the most important skill in data analytics, critical thinking is the ability to question what’s in front of you to better understand it. You’ll have a naturally inquisitive mindset, won’t take anything at face value, and will approach tasks using logical reasoning and deduction.
- Creative problem-solving: Problem-solving involves applying your reflective way of seeing the world to specific data-related situations or problems. You’ll take a step-by-step approach when defining a problem, devise an approach for solving it, and carry out the necessary subsequent tasks. These tasks will be different every time, so you’ll need a creative mindset.
- Ethics: You’ll understand the importance of data privacy, be aware of your personal biases, and be comfortable presenting outcomes—even when these are undesirable or are unlikely to win you any praise. Adhering to a strong ethical code is hugely important. Without it, data can be easily misused, which can have a real-world impact on individuals and groups affected by your work.
If you’re dipping a toe into data analytics for the first time, ask yourself: do these skills describe you? If not, don’t worry. While it’s important to appraise your strengths and weaknesses honestly, the most important thing is to be enthusiastic about the field and willing to develop the necessary skills. Nobody hiring a beginner will expect you to be an expert right away.
5. What tools do data analysts use?
So far, we’ve covered the skills a data analyst needs and the high-level process and tasks they need to carry out. As a beginner, this may feel a bit overwhelming. Fortunately, there’s a huge range of applications and software to help streamline the process. While these require a bit of technical know-how, once you’ve covered the basics, you should find the whole process a lot easier.
Common tools for data analysts include:
- MS Excel
- Python
- R
- Databases and data management systems
- Structured Query Language (SQL)
- Industry-specific data analytics tools
Let’s take a closer look at some of those now.
MS Excel for data analytics
A must-have for any data analyst is MS Excel. Excel allows you to sort data, break it into smaller subsets, and use a wide variety of features to understand it better. These include pivot tables, lookup functions like XLOOKUP and VLOOKUP, the AVERAGE function (which gives you the average of a given range of numbers), and the SUMIF function (which adds up only the cells in a range that meet a condition you specify). These tools, along with a great many more, make Excel an invaluable piece of software for beginners and experts alike.
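If you already think in code, it can help to see roughly what these Excel features do under the hood. The sketch below mirrors AVERAGE, SUMIF, a lookup, and a simple pivot table in plain Python. The order rows and product names are invented, and this is an analogy for the logic only, not a description of how Excel itself is implemented.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical order rows, as they might appear in a worksheet.
orders = [
    {"region": "north", "product": "A", "amount": 100},
    {"region": "south", "product": "B", "amount": 250},
    {"region": "north", "product": "B", "amount": 150},
    {"region": "east", "product": "A", "amount": 300},
]
# Hypothetical lookup table mapping product code to product name.
product_names = {"A": "Starter plan", "B": "Pro plan"}

# AVERAGE: the mean of a range of numbers.
average_amount = mean(row["amount"] for row in orders)

# SUMIF: add up only the cells whose row meets a condition.
north_total = sum(row["amount"] for row in orders if row["region"] == "north")

# VLOOKUP / XLOOKUP: find a key in one table and return the matching value.
first_product_name = product_names.get(orders[0]["product"], "not found")

# Pivot table: group rows by one field and aggregate another.
total_by_region = defaultdict(int)
for row in orders:
    total_by_region[row["region"]] += row["amount"]

print(average_amount, north_total, first_product_name, dict(total_by_region))
```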
Python for data analytics
The general-purpose programming language, Python, has fast become the go-to programming tool for data analysts. This is partly because of its simple syntax, which makes it quick and easy to learn. However, its popularity is also down to the fact that the Python Package Index (PyPI) offers a massive range of software libraries.
Python can be used for almost any aspect of the data analytics process. For instance, Pandas is excellent for manipulating time-series and other tabular data. Matplotlib is perfect for data visualization. And NumPy is popular for performing a range of complex mathematical operations. These are just three of the many thousands of Python packages that are available.
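As a small taste of two of these libraries, the following sketch uses pandas to reshape a short time series and NumPy for array math. The visit counts and dates are invented for illustration, and plotting with Matplotlib is left out to keep the example short.

```python
import numpy as np
import pandas as pd

# Hypothetical daily website visits for two weeks; the figures are invented.
visits = np.array([120, 135, 128, 150, 160, 90, 85,
                   130, 140, 138, 155, 170, 95, 88])
days = pd.date_range("2024-01-01", periods=len(visits), freq="D")
series = pd.Series(visits, index=days, name="visits")

# pandas: roll the daily series up into 7-day totals.
weekly_totals = series.resample("7D").sum()

# pandas: smooth day-to-day noise with a 3-day rolling average.
rolling_avg = series.rolling(window=3).mean()

# NumPy: vectorized math over the whole array at once.
daily_change_pct = np.diff(visits) / visits[:-1] * 100
busiest_day = days[int(np.argmax(visits))]

print(weekly_totals)
print("Busiest day:", busiest_day.date())
```

The first two entries of `rolling_avg` are empty because a 3-day window needs three observations before it can produce a value.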
R for data analytics
R, another programming language, is also common in data analytics. While R is generally considered more complex to learn than Python, it remains popular due to its historical use in statistical programming (which has benefits in a field like data analytics). While R doesn’t carry out things like image processing with the ease of Python, it has more data analytics functions built in. It’s also often used in scientific fields. Like Python, R also has a central package repository, CRAN (the Comprehensive R Archive Network), with many additional packages available.
Databases and data management systems
As the variety of data we collect becomes more complex, the way we store and manage these data is also evolving. In data analytics, it’s vital to have an understanding of how databases and data warehouses work. For instance, MySQL is a relatively simple type of relational database management system that is commonly used.
Apache Hadoop, meanwhile, is a more complex framework, used to store, manage, and process big data across clusters of computers. Whether you’re using simple databases or complex infrastructures, they are ultimately unavoidable!
Structured Query Language (SQL)
SQL (sometimes pronounced ‘sequel’) is a query language designed to communicate with relational databases. In a world where data is the main currency, this has obvious applications. While relational database systems are themselves built using a variety of languages, such as C or C++, SQL allows you to pull, add, or edit data without needing to know the language the system was written in.
Since most organizations now have information stored digitally or online, SQL is becoming an important language to learn, even for non-analysts. It’s a must-have for those in the field.
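To show what pulling, adding, and editing data with SQL looks like in practice, here is a minimal sketch that runs SQL statements from Python using the built-in sqlite3 module. The in-memory SQLite database, the `orders` table, and its rows are all assumptions made for illustration; in a real job you would connect to your organization’s own database instead.

```python
import sqlite3

# An in-memory SQLite database stands in for a real database server here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# Add data (INSERT). The rows are invented for illustration.
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("north", 100), ("south", 250), ("north", 150), ("east", 300)],
)

# Edit data (UPDATE): correct the amount on order 2.
conn.execute("UPDATE orders SET amount = ? WHERE id = ?", (275, 2))

# Pull data (SELECT): total sales per region, largest first.
totals = conn.execute(
    "SELECT region, SUM(amount) AS total FROM orders "
    "GROUP BY region ORDER BY total DESC, region"
).fetchall()
conn.close()

for region, total in totals:
    print(region, total)
```

The `?` placeholders pass values separately from the SQL text, which is the usual way to avoid quoting mistakes and SQL injection when queries include user-supplied input.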
Industry-specific data analytics tools
In addition to the tools already described, the industry is producing ever more sophisticated applications to support data analytics. These range from general business intelligence software like Microsoft Power BI to data visualization and dashboarding applications like Tableau.
They also include niche products that you’ll only be likely to learn if you work in a specific industry. For instance, Definitive Healthcare is an analytics platform designed specifically to manage tasks relating to health data.
Want to learn more? Check out this round-up of the top tools used by data analysts.
6. Wrap-up and further reading
In this post, we’ve covered everything you need to know if you’re just starting in data analytics. We’ve explored what a data analyst does, what skills they need, and the basic tools that a beginner analyst should aim to learn.
Once you have all these skills at your fingertips, you’ll soon be ready to enter the field. Whether you’re interested in data analytics for e-commerce, finance, healthcare, government, the sciences, or any other area of your choosing, one of the great benefits of the field is its versatility. With a little experience under your belt, you can branch into broader data science, or specialize in areas like data engineering, data modeling, or machine learning.
For a deeper taste of what data analytics involves, try our free, five-day data analytics short course.