Even the most tech-savvy data analysts (or aspiring data analysts) can benefit from a digital detox at times. What better way to take a screen break than by curling up with a good book? If the latest fiction best seller isn’t your thing, why not check out a tome that will help you get to grips with a new aspect of data analytics?
In this post, we list a careful selection of our favorite books for data enthusiasts. We’ve deliberately chosen data books that we think complement each other well (though you can be the judge of that!) and grouped them into the following sections:
- Getting started: Broad introductions to data
- Expressing insights: Data visualization
- Upskilling: Getting to grips with statistics
- Applications: Data analytics in business
From the broad-ranging to the nitty-gritty, here are 12 books for aspiring data analysts:
1. Getting started: Broad introductions to data
Amid the hype and potential horrors of sentient machines wiping out humanity, British mathematician Hannah Fry’s Hello World takes readers on a balanced but unflinching tour of the pros and cons of our ever-more algorithm-driven society. With wit and precision, Fry looks at how data and algorithms have the power to transform our world for the better. She doesn’t hold back on examples: algorithms have the potential, for instance, to improve our justice system and advance our healthcare. But Fry doesn’t shy away from exploring areas where our blind faith in algorithms can potentially lead to dystopian horrors. Think the destruction of democracy! A well-rounded introduction to our data-driven world, this book is a funny and fascinating love letter to data and is suitable for those who are completely new to the field. Highly recommended!
Perhaps the most important skill for any data analyst is the ability to think critically and overcome one’s biases and expectations. The Drunkard’s Walk by U.S. physicist Leonard Mlodinow (who, for the record, was a close friend of the late, great Stephen Hawking) tackles the issue of randomness, chance, and probability in our daily lives. While this might not initially seem that relevant to the field of data analytics, the book explores how our reliance on statistics for everything—from political polls to student grades and financial markets—is not as infallible as it seems. Irreverent and clear in his explanations, Mlodinow illuminates some of the more complex aspects of probability and statistics, using language that anyone can understand. This book should help any budding data analyst appreciate the importance of data, while understanding that data analytics goes hand-in-hand with critical thinking skills.
Fascinated by self-driving cars and computers that can beat humans at chess? Want to know how Netflix figures out what you want to watch with such a high level of accuracy? Look no further. Written by machine learning engineer Sean Gerrish, How Smart Machines Think is the ideal introduction to artificial intelligence and machine learning for those who know next to nothing about the topic. The book explores both the theory and the practice of creating machine learning algorithms, explaining both how they learn (for example via reinforcement learning, much in the way a dog is trained with treats) and the software architecture behind famous deep learning systems and artificial neural networks, such as DeepMind’s AlphaGo. The book also gives a voice to the experts behind these cutting-edge technologies, making it an all-round box-ticker for any data analyst interested in the topic. And if you aren’t interested yet, you certainly will be once you’ve finished reading!
2. Expressing insights: Data visualization
Cartographies of Time: A History of the Timeline—Daniel Rosenberg and Anthony Grafton, 2010
About a decade ago, there was huge hype around infographics. Where exactly it came from, who knows, but suddenly companies everywhere were representing their histories (usually badly) on some kind of graphical timeline. The impression was that this was somehow a new idea. This book puts an end to that notion, taking the reader on a historical journey through one of the first types of data representation: the timeline. A history of graphic representations of time in Europe and the United States, Cartographies of Time shows that timelines are not the preserve of 21st-century marketers, but have been around for centuries. From representing the genealogies of Christ using human body parts to charting ships at points in time (rather than by geographic location), this book is a fascinating visual treat. It’s stuffed with great illustrations, too, making it a lush addition to our list!
Moving from the history of data visualization to a practical guide, The Functional Art offers tips for using data viz to represent important insights. Written by data journalist Alberto Cairo, the book leans towards data viz for public consumption, but the principles can be broadly applied. A practical introduction, it explores how turning figures into graphics can help the human brain better comprehend information. Cairo introduces everything from statistical charts and maps to explanatory diagrams, and shows how these are commonly used across industries. The important thing about this book is that it relies on core underlying principles, namely that data viz best practice and beautiful representation should go hand in hand, with neither prioritized at the expense of the other. A must-read for any newbie data viz enthusiast.
If you want a book that’s a little less didactic and isn’t back-to-back text, then this is the one for you. Writer and designer David McCandless has published several books on data visualization, and it’s hard to choose between them! However, we’ve selected this one because it is a true piece of artwork: a visual ode to data viz. McCandless’ genius eye shows how to represent data that are too complex or abstract to be understood in any other way. This inspirational piece demonstrates many ways in which we can blend data points, representing their relationships to one another in beautiful but meaningful ways. The author doesn’t focus only on visuals, though; he also highlights ways of connecting datasets that many might not think to compare. A book you’ll want to take your time over, and a future coffee table favorite, it’s well worth checking out.
3. Upskilling: Getting to grips with statistics
Statistics is a fundamental skill for any data analyst. But before adopting the tools necessary for carrying out statistical analyses in a workplace setting, you need to get the basics down. In The Art of Statistics, renowned statistician David Spiegelhalter is on hand to help, specifically aiming to improve the reader’s statistical literacy. After covering the ‘basics’ (we use quotes, since you’ll need a solid foundation in math to grasp the concepts), Spiegelhalter gets behind the theory to explain how you can use different models to pull accurate insights from raw data. Using lots of real-world examples to bring the concepts to life, the book introduces all the statistical techniques you’ll need to start your journey in data analytics. It’s also a great reference book to return to.
Once you’ve nailed the basic statistical models, start learning the tools you’ll need to apply them. Enter Python for Data Analysis. Written by Wes McKinney, the software developer behind the pandas Python library for data analysis, this book covers what you need to know about the most common programming language in the field. McKinney looks at the process of manipulating, cleaning, collating, and analyzing data using Python, adopting hands-on tasks so you can play around with Python and its features, using the book as a guide. Ranging from Python’s basic numerical features to creating scatterplots and using the language for problem-solving in areas like social sciences and economics, the book is packed with examples and case studies. A great one for introducing what could otherwise be a tough topic.
Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python—Andrew Bruce, Peter C. Bruce, and Peter Gedeck, 2020
Bridging the gap between programming and statistics, this book—co-written by three renowned data experts—will expand your knowledge of statistics using both the Python and R programming languages. Acknowledging that most data analysts aren’t formally trained in statistical programming, Practical Statistics for Data Scientists takes a data-analysis-specific look at statistical problem-solving. The great thing about this book is that it’s not just a ‘how-to’ statistics guide. It also links the concepts to fundamental data analytics theory, such as why exploratory data analysis is so important (and how to carry it out). From concepts such as random sampling and experimental design to techniques like regression and classification, this book covers it all, while acting as a useful training guide for Python and R.
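To give a flavor of the kind of concepts mentioned above, here is a minimal sketch of random sampling, a quick exploratory summary, and a simple linear regression. It is not taken from the book: the toy dataset, the variable names, and the `fit_line` helper are all assumptions made up for illustration, and it uses only Python’s standard library rather than the data analysis libraries you would more likely reach for in practice.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical "population": 200 (x, y) pairs where y is roughly 3 + 2x plus noise.
population = [(x, 3.0 + 2.0 * x + random.gauss(0, 1.0)) for x in range(1, 201)]

# Random sampling: draw 50 observations without replacement.
sample = random.sample(population, k=50)
xs = [x for x, _ in sample]
ys = [y for _, y in sample]

# A tiny piece of exploratory data analysis: summarize the sample before modeling it.
summary = {
    "n": len(sample),
    "mean_y": statistics.fmean(ys),
    "median_y": statistics.median(ys),
    "stdev_y": statistics.stdev(ys),
}


def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares (illustrative helper)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)
print(summary)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

Because the toy data were generated with a slope of 2, the fitted slope should land close to that value. With real data you don’t know the answer in advance, which is exactly why summarizing and plotting the data first is worth the effort.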
4. Applications: Data analytics in business
Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking—Foster Provost, Tom Fawcett
Feel comfortable with the basic techniques and tools we’ve covered so far? Then perhaps it’s time to build on the concepts you’ve learned by applying them in a business intelligence context. More of a technical guide than any of the other data books we’ve listed so far, Data Science for Business includes both the math you’ll need to grasp and apply various statistical models and the wider contexts in which you’ll use them. The book is based on an MBA course taught for over a decade by Foster Provost at New York University. Although not ideal for beginners, it’s definitely comprehensive and uses excellent real-world business examples to lift the concepts off the page. Perfect if you want to dive a bit deeper and test your intellect.
Business Data Science: Combining Machine Learning and Economics to Optimize, Automate, and Accelerate Business Decisions—Matt Taddy, 2019
If you’re looking for tools you’re likely to use, rather than an encyclopedia of concepts, Matt Taddy delivers. With hands-on experience at companies like eBay, Microsoft, and Amazon, his expertise in economics, big data, and machine learning is at the cutting edge of the technologies being used within data analytics today. You’ll need some statistics know-how before diving into this tome, but for the most part, the book is written in an engaging, chatty manner that should appeal to everyone from business leaders to data engineers. What makes this book stand out, though, is that it goes beyond just listing applications and techniques. Rather, using real-world examples, Taddy shares his personal insights on the use of data science in business, which makes it feel like a real treasure trove of hidden secrets.
Creating successful businesses means more than having the right practical data skills. It also means understanding the systems that underpin our work. In data science, this means becoming aware of our built-in biases. Caroline Criado Perez’s Invisible Women shines a light on this issue, exploring how vast amounts of data fail to account for gender, treating men as ‘the norm’ and women as atypical. Without apportioning blame or shame, the book simply states the facts, showing that baked-in biases shape everything from how our technology is designed for men to how our healthcare is built around the male anatomy, and how the resulting shape of our society impacts negatively on women. While this book focuses on gender inequality, it’s a must-read for any data analyst looking to expand their awareness of how different minority groups are represented (or not represented) in big data. We hold a great responsibility for others in our hands, and we must take that responsibility seriously.
There we have it—12 carefully curated data books catering to aspiring data analysts of all experience levels. Whether you’re still learning the basics, or are ready to dive in with tools like Python and R, we hope you’ll find something on our list to enjoy!
Brand new to data analytics and want to test the water before splashing the cash on a book? Why not check out this free, 5-day data analytics short course?