class: center, titleslide
# Python Course
# Part 2: Handling data with Pandas

---
layout: true
class: mainlayout

---
class: tocslide

.left-column[
## Pandas Library
]
.right-column[
]

---
class: tocslide

.left-column[
## Pandas Library
]
.right-column[
]
--
.right-column-next[
]

---
class: tocslide

.left-column[
## Pandas Library
## Agenda
]
.right-column[
### What are we going to discuss today?

1. Terminology
2. Specific topics:
    - Opening files
    - Saving files
    - Navigating a DataFrame
    - Selecting data
    - Creating new columns
    - Merging data
    - GroupBy operations
    - Plotting with Pandas
    - Plotting with Seaborn
]

---
class: tocslide

.left-column[
## Pandas Library
## Agenda
## Terminology
]
.right-column[
## Terminology
### Pandas vs. NumPy

NumPy provides a powerful N-dimensional array object.
Pandas builds on NumPy: it adds labeled rows and columns on top of these arrays.
]
--
.right-column-next[
### pd.DataFrame vs. pd.Series

A Pandas Series is a 1D data structure (like a vector).
A Pandas DataFrame is a 2D data structure (like a matrix).
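
A minimal sketch of the two structures, assuming a small made-up table (the `firm` and `sales` columns are hypothetical and only serve as illustration):

```python
import pandas as pd

# Hypothetical toy data, used only for illustration.
df = pd.DataFrame({"firm": ["A", "B", "C"], "sales": [10.0, 20.0, 30.0]})

column = df["sales"]  # selecting one column gives a Series
row = df.iloc[0]      # selecting one row also gives a Series

print(type(df).__name__, df.shape)          # DataFrame (3, 2)
print(type(column).__name__, column.shape)  # Series (3,)
print(type(row).__name__, row.shape)        # Series (2,)
```
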
Columns and rows in a DataFrame are Series.
]

---
class: tocslide

.left-column[
## Open data
]
.right-column[
## Opening data

Pandas can open most common data files, including CSV, Excel, Stata, SAS, JSON, and HDF files!

Opening and Saving files with Pandas notebook
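
As a rough illustration of opening data, the sketch below reads CSV text with `pd.read_csv`. The CSV content and column names are made up; in practice you would pass a file path instead of an in-memory buffer. Other formats have their own readers, such as `pd.read_excel`, `pd.read_stata`, and `pd.read_sas`.

```python
import io

import pandas as pd

# Hypothetical CSV content; normally this would live in a file on disk.
csv_text = "firm,year,sales\nA,2018,10.5\nB,2018,20.0\n"

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(df.dtypes)  # column types are inferred while loading
```
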
]

---
class: tocslide

.left-column[
## Open data
## Save data
]
.right-column[
## Saving data

Pandas can save to most common data formats as well. One exception is SAS: Pandas can read SAS files but cannot write them.

Opening and Saving files with Pandas notebook
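
A small sketch of saving a DataFrame, assuming CSV as the target format. The file name `sales.csv` and the data are hypothetical, and a temporary folder is used so that nothing is left behind on disk.

```python
import os
import tempfile

import pandas as pd

# Hypothetical toy data, used only for illustration.
df = pd.DataFrame({"firm": ["A", "B"], "sales": [10.5, 20.0]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sales.csv")  # hypothetical file name
    df.to_csv(path, index=False)               # index=False leaves out the row labels
    df_roundtrip = pd.read_csv(path)           # read the file back to check it

print(df_roundtrip)
```
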
]

---
class: tocslide

.left-column[
## Open data
## Save data
## HDF files
]
.right-column[
## HDF files

Tip: HDF files are awesome!
]

---
class: tocslide

.left-column[
## Open data
## Save data
## Navigate
]
.right-column[
## How to inspect your data?

There is no standard data browser for DataFrames.

## My recommendation

Use basic operations to view parts of the data in the notebook:
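
A few such basic operations, sketched on a made-up DataFrame (the column names are hypothetical):

```python
import pandas as pd

# Hypothetical toy data, used only for illustration.
df = pd.DataFrame({"firm": list("ABCDEF"), "sales": [10, 20, 30, 40, 50, 60]})

print(df.shape)       # number of rows and columns
print(df.head(3))     # first three rows
print(df.tail(2))     # last two rows
print(df.dtypes)      # data type of each column
print(df.describe())  # summary statistics for the numeric columns
```
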
]

---
class: tocslide

.left-column[
## Open data
## Save data
## Navigate
]
.right-column[
## Alternative: use the QGrid extension
]

---
class: tocslide

.left-column[
## Open data
## Save data
## Navigate
## Select data
]
.right-column[
## Selecting data

Selecting data based on a condition, Jupyter Notebook
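
Selecting rows on a condition usually comes down to a boolean mask. The sketch below assumes a made-up table with hypothetical `firm`, `year`, and `sales` columns.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration.
df = pd.DataFrame({
    "firm": ["A", "B", "C", "D"],
    "year": [2018, 2018, 2019, 2019],
    "sales": [10, 25, 30, 5],
})

# A comparison gives a True/False Series; indexing with it keeps the True rows.
large = df[df["sales"] > 20]

# Combine conditions with & (and) or | (or); each condition needs parentheses.
large_2019 = df[(df["sales"] > 20) & (df["year"] == 2019)]

# .loc selects rows by condition and columns by name in one step.
firms_2018 = df.loc[df["year"] == 2018, "firm"]

print(large)
print(large_2019)
print(firms_2018)
```
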
]

---
class: tocslide

.left-column[
## Open data
## Save data
## Navigate
## Select data
## Create Columns
]
.right-column[
## Creating columns

Various methods to create columns, Jupyter Notebook
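
Three common ways to create a column, sketched on made-up data (all column names are hypothetical): arithmetic on existing columns, a condition, and an element-wise function.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration.
df = pd.DataFrame({
    "firm": ["A", "B", "C"],
    "sales": [100.0, 250.0, 40.0],
    "costs": [80.0, 200.0, 50.0],
})

# Arithmetic works on whole columns at once.
df["profit"] = df["sales"] - df["costs"]

# A comparison gives a True/False column.
df["loss_making"] = df["profit"] < 0

# apply runs a function on every value of a column.
df["label"] = df["firm"].apply(lambda name: "firm_" + name)

print(df)
```
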
]

---
class: tocslide

.left-column[
## Open data
## Save data
## Navigate
## Select data
## Create Columns
## Merge data
]
.right-column[
## Merging DataFrames

Various methods to merge, join, and append, Jupyter Notebook
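
A minimal sketch of merging and appending, assuming two made-up tables that share a hypothetical `firm` key. Rows are appended with `pd.concat` here, because `DataFrame.append` was removed in pandas 2.0.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration.
sales = pd.DataFrame({"firm": ["A", "B", "C"], "sales": [10, 20, 30]})
sectors = pd.DataFrame({"firm": ["A", "B", "D"], "sector": ["tech", "retail", "energy"]})

# Inner merge: keep only firms that appear in both tables.
inner = pd.merge(sales, sectors, on="firm", how="inner")

# Left merge: keep every row of sales; missing matches become NaN.
left = pd.merge(sales, sectors, on="firm", how="left")

# Append rows by stacking tables with the same columns.
stacked = pd.concat([sales, sales], ignore_index=True)

print(inner)
print(left)
print(stacked)
```
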
]

---
class: tocslide

.left-column[
## Open data
## Save data
## Navigate
## Select data
## Create Columns
## Merge data
## GroupBy Operation
]
.right-column[
## GroupBy Operations

Various methods to group and aggregate data, Jupyter Notebook
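
A GroupBy operation splits the rows into groups, applies a function to each group, and combines the results. A small sketch on made-up data (the `sector` and `sales` columns are hypothetical):

```python
import pandas as pd

# Hypothetical toy data, used only for illustration.
df = pd.DataFrame({
    "sector": ["tech", "tech", "retail", "retail"],
    "sales": [10, 30, 20, 40],
})

# One value per group.
mean_sales = df.groupby("sector")["sales"].mean()

# Several statistics per group at once.
summary = df.groupby("sector")["sales"].agg(["mean", "sum", "count"])

# transform returns one value per original row, handy for new columns.
df["sector_mean"] = df.groupby("sector")["sales"].transform("mean")

print(mean_sales)
print(summary)
print(df)
```
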
]

---
class: tocslide

.left-column[
## Save data
## Navigate
## Select data
## Create Columns
## Merge data
## GroupBy Operation
## Plotting
]
.right-column[
## Plotting data (Pandas and Seaborn)

Comprehensive notebook for plotting with Pandas
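
A minimal sketch of plotting directly from a DataFrame. It covers only the Pandas side and assumes Matplotlib is installed, since Pandas uses it for plotting; the data and labels are made up.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; leave this line out in a notebook

import pandas as pd

# Hypothetical toy data, used only for illustration.
df = pd.DataFrame({"year": [2016, 2017, 2018, 2019], "sales": [10, 15, 12, 20]})

# DataFrame.plot returns a Matplotlib Axes that can be customised further.
ax = df.plot(x="year", y="sales", kind="line", title="Sales per year")
ax.set_ylabel("sales")
```
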
]

---
class: tocslide

.left-column[
## Get Started!
]
.right-column[
What is next?

## Demonstration:

Watch the demonstration video; see Discord for the link.

## Problems:

Solve tasks in the "pandas_mini_problems.ipynb" notebook.
]

---
class: tocslide
exclude: true

.left-column[
## Closing remarks
]
.right-column[
Questions?
]

---
class: tocslide
exclude: true

.left-column[
## Closing remarks
## Demonstration
]
.right-column[
Demonstration
]

---
class: tocslide
exclude: true

.left-column[
## Closing remarks
## Demonstration
## Mini-Task Instructions
]
.right-column[
## Mini Tasks

**Goal:** Get hands-on experience with Pandas using actual data.

### Instructions

1. Open (start) a Jupyter Notebook in the `limperg_python_2019` folder
2. Solve tasks in: `minitasks > day_2 > pandas_mini_task.ipynb`

### You will need these notebooks:

- Python tutorial
- Python Basics Notebook
- Opening files with Python / Pandas
- Data handling with Pandas
]