Lab 03: Pandas
This lab covers the Pandas library, including topics such as data cleaning and data manipulation.
There are six questions in total (1.1 through 1.6). Please provide your code answers directly below each question.
Make sure to run all cells so that the answers are stored. Once completed, export your .ipynb file as either HTML or PDF, ensuring that all answers are included. Submit the HTML or PDF file to Canvas by midnight (11:29 PM) on September 22.
Use the vehicle_crashes.csv file to complete Lab 03.
This lab is worth a total of 65 points and contributes 6.5% toward the final grade.
1. Provide the code to read the data into a DataFrame.
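As a starting point, the read step can be sketched as below. This is only an illustration: it assumes vehicle_crashes.csv is a comma-separated file, and it uses a tiny in-memory stand-in with invented rows so the snippet runs without the real file. The variable name `crashes` is also just a placeholder.

```python
import io

import pandas as pd

# Hypothetical stand-in for vehicle_crashes.csv; these rows are invented.
sample_csv = io.StringIO(
    "CRASH_DA_1,CRASH_SEVE,NUMB_VEHC,STREETNAME\n"
    "2019-01-05,Non-fatal injury,2,MAIN ST\n"
    "2019-02-11,Property damage only,1,ELM ST\n"
)

# With the real file in the working directory, this would be:
#     crashes = pd.read_csv("vehicle_crashes.csv")
crashes = pd.read_csv(sample_csv)
print(type(crashes))
```

If the real file lives in another folder, pass its path to `pd.read_csv` instead of the bare file name.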
1.1 Access data Part 1 (10 points)
- How many rows and columns are present in the DataFrame?
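One possible way to answer this is the `shape` attribute, which returns a `(rows, columns)` tuple. The sketch below uses a small toy DataFrame with invented values in place of the real crash data.

```python
import pandas as pd

# Toy stand-in for the crash data; all values are invented.
crashes = pd.DataFrame({
    "CRASH_DA_1": ["2019-01-05", "2019-02-11", "2019-03-20"],
    "CRASH_SEVE": ["Non-fatal injury", "Property damage only", "Non-fatal injury"],
    "NUMB_VEHC": [2, 1, 3],
    "STREETNAME": ["MAIN ST", "ELM ST", "MAIN ST"],
})

# shape is an attribute, not a method, so there are no parentheses.
n_rows, n_cols = crashes.shape
print(f"{n_rows} rows, {n_cols} columns")
```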
1.2 Access data Part 2 (10 points)
- Print the names of all the columns in the DataFrame
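A minimal sketch of listing column names, again on a toy DataFrame with invented values. Converting the `columns` index to a plain list is optional but tends to print more compactly.

```python
import pandas as pd

# Toy stand-in for the crash data; all values are invented.
crashes = pd.DataFrame({
    "CRASH_DA_1": ["2019-01-05", "2019-02-11"],
    "CRASH_SEVE": ["Non-fatal injury", "Property damage only"],
    "NUMB_VEHC": [2, 1],
    "STREETNAME": ["MAIN ST", "ELM ST"],
})

# columns is an Index object; tolist() turns it into a regular Python list.
column_names = crashes.columns.tolist()
print(column_names)
```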
1.3 Data Cleaning (10 points)
- Create a new DataFrame that includes only the columns below:
'CRASH_DA_1', 'CRASH_SEVE', 'NUMB_VEHC', 'STREETNAME'
- Print the first 8 rows using .head()
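Column selection with a list of names could look like the sketch below. The toy DataFrame and its extra `OBJECTID` column are invented for illustration; the real file may contain different additional columns.

```python
import pandas as pd

# Toy stand-in for the crash data; OBJECTID and all values are invented.
crashes = pd.DataFrame({
    "OBJECTID": range(1, 11),
    "CRASH_DA_1": ["2019-01-05"] * 10,
    "CRASH_SEVE": ["Property damage only"] * 10,
    "NUMB_VEHC": [1, 2, 2, 3, 1, 2, 4, 2, 1, 2],
    "STREETNAME": ["MAIN ST", "ELM ST"] * 5,
})

wanted = ['CRASH_DA_1', 'CRASH_SEVE', 'NUMB_VEHC', 'STREETNAME']

# Indexing with a list of names keeps only those columns;
# copy() makes the result independent of the original frame.
subset = crashes[wanted].copy()

# head(8) returns the first 8 rows (the default without an argument is 5).
first_eight = subset.head(8)
print(first_eight)
```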
1.4 Data Cleaning Part 2 (10 points)
- Rename the following column:
'CRASH_DA_1' to 'CRASH_DATE'
- Print the column names
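Renaming can be sketched with `DataFrame.rename` and a mapping from old name to new name. The toy data is invented; `rename` returns a new DataFrame here, so the result is assigned to a new variable.

```python
import pandas as pd

# Toy stand-in for the reduced crash data; all values are invented.
subset = pd.DataFrame({
    "CRASH_DA_1": ["2019-01-05", "2019-02-11"],
    "CRASH_SEVE": ["Non-fatal injury", "Property damage only"],
    "NUMB_VEHC": [2, 1],
    "STREETNAME": ["MAIN ST", "ELM ST"],
})

# Map the old column name to the new one; other columns are left unchanged.
renamed = subset.rename(columns={"CRASH_DA_1": "CRASH_DATE"})
print(renamed.columns.tolist())
```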
1.5 Converting data type (15 points)
- Convert CRASH_DATE to a datetime data type and store the result in a new column
- Print the first 5 rows of the DataFrame
- Print a concise summary of the DataFrame
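A hedged sketch of the conversion using `pd.to_datetime`. It assumes the dates are stored as parseable strings, which may not match the real file, and the new column name `CRASH_DATETIME` is a hypothetical choice. The toy values are invented.

```python
import pandas as pd

# Toy stand-in for the renamed crash data; all values are invented.
crashes = pd.DataFrame({
    "CRASH_DATE": ["2019-01-05", "2019-02-11", "2019-03-20"],
    "CRASH_SEVE": ["Non-fatal injury", "Property damage only", "Non-fatal injury"],
    "NUMB_VEHC": [2, 1, 3],
    "STREETNAME": ["MAIN ST", "ELM ST", "MAIN ST"],
})

# errors="coerce" turns unparseable entries into NaT instead of raising.
# CRASH_DATETIME is a hypothetical name for the new column.
crashes["CRASH_DATETIME"] = pd.to_datetime(crashes["CRASH_DATE"], errors="coerce")

print(crashes.head())  # first 5 rows by default
crashes.info()         # concise summary: dtypes, non-null counts, memory usage
```

In the `info()` output, the new column should show a `datetime64` dtype while the original `CRASH_DATE` column remains `object`.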
1.6 Aggregating Statistics (10 points)
- Group NUMB_VEHC by STREETNAME and compute an aggregate statistic
- Sort the resulting DataFrame by the aggregated NUMB_VEHC, from the largest value to the smallest
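The grouping and sorting steps might be sketched as follows. The question does not name a specific statistic, so `sum` is used here purely as an assumption; `mean`, `count`, or `max` would follow the same pattern. The toy values are invented.

```python
import pandas as pd

# Toy stand-in for the crash data; all values are invented.
crashes = pd.DataFrame({
    "STREETNAME": ["MAIN ST", "ELM ST", "OAK AVE", "MAIN ST", "OAK AVE"],
    "NUMB_VEHC": [2, 1, 2, 3, 2],
})

# Group by street and aggregate the vehicle counts (sum is an assumed choice),
# then turn the grouped Series back into a DataFrame.
totals = crashes.groupby("STREETNAME")["NUMB_VEHC"].sum().reset_index()

# ascending=False sorts from the largest aggregated value to the smallest.
totals_sorted = totals.sort_values("NUMB_VEHC", ascending=False)
print(totals_sorted)
```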