pandas and Data Analysis

Welcome to Learning pandas! In this book, we will learn pandas, an open source data analysis library for the Python programming language. The pandas library provides high-performance, easy-to-use data structures and analysis tools for Python. pandas brings to Python many of the strengths of the statistical programming language R, specifically data frame objects and the capabilities of R packages such as plyr and reshape2, and places them in a single library that you can use from within Python.
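
As a first taste of what this means in practice, the following is a minimal sketch rather than a definitive recipe. The sample data and column names are invented for illustration. It builds a data frame, summarizes it by group (similar in spirit to what plyr offers in R), and reshapes it from wide to long form (similar in spirit to reshape2's melt):

```python
import pandas as pd

# Hypothetical sample data, invented purely for illustration
df = pd.DataFrame({
    "city": ["Austin", "Austin", "Boston", "Boston"],
    "year": [2020, 2021, 2020, 2021],
    "sales": [10, 14, 7, 9],
    "returns": [1, 2, 0, 1],
})

# Split the rows by city, sum the sales in each group, and combine
# the results into a Series indexed by city
totals = df.groupby("city")["sales"].sum()

# Reshape from wide to long: one row per (city, year, measure)
long_df = df.melt(
    id_vars=["city", "year"],
    value_vars=["sales", "returns"],
    var_name="measure",
    value_name="amount",
)

print(totals)
print(long_df)
```

Both operations are methods on the same DataFrame object, which is the sense in which pandas gathers these capabilities into a single library.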

In this first chapter, we will take the time to understand pandas and how it fits into the bigger picture of data analysis. This gives the reader a feel for where pandas sits in that picture, rather than focusing entirely on the details of using it. The goal is that, while learning pandas, you also learn why its features exist and how they support data analysis tasks.

So, let's jump in. In this chapter, we will cover:

  • What pandas is, why it was created, and what it gives you
  • How pandas relates to data analysis and data science
  • The processes involved in data analysis and how they are supported by pandas
  • General concepts of data and analytics
  • Basic concepts of data analysis and statistical analysis
  • Types of data and their applicability to pandas
  • Other libraries in the Python ecosystem that you will likely use with pandas