Day 4 of #100daysofcode #Dataengineering
Started the day in yet another awesome way. It's day four and I'm happy with the progress so far. Most of the time I try to push myself to the limit, just to be able to come through.
My main focus for today was on indexes and updates.
INDEXES
I learned that an index is the label that identifies each row in a pandas dataframe. It is usually unique, although pandas does not force it to be. Below is a simple example of how indexes work.
Case: Say you have a dataset with four columns, namely emp_no, first_name, last_name and email, and you want to set emp_no as the index. You simply pass it in when reading the file.
import pandas as pd
Julius = pd.read_csv('file_location', index_col='emp_no')
This means you can set the index while importing your dataset. If you would rather load the data first, you can set the index afterwards with the code below.
Julius.set_index('emp_no', inplace=True)
Passing inplace=True changes the default behaviour: instead of returning a new dataframe, the operation modifies the existing one and returns nothing. That is why the result is not assigned back to Julius.
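To see both ways of setting the index side by side, here is a minimal sketch. The CSV text, the names and the email addresses are made up for illustration, and io.StringIO stands in for the real file location.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the CSV file in the post.
csv_text = """emp_no,first_name,last_name,email
101,Ada,Obi,ada@example.com
102,Tunde,Bello,tunde@example.com
"""

# Option 1: set the index while reading the file.
at_read = pd.read_csv(io.StringIO(csv_text), index_col="emp_no")

# Option 2: read first, then set the index in place.
# With inplace=True the call modifies `plain` and returns None.
plain = pd.read_csv(io.StringIO(csv_text))
returned = plain.set_index("emp_no", inplace=True)

# Without inplace, set_index returns a new dataframe
# and leaves the original untouched.
original = pd.read_csv(io.StringIO(csv_text))
reindexed = original.set_index("emp_no")

# Rows can now be looked up by their emp_no label.
print(at_read.loc[101, "first_name"])
```

Assigning the result of an inplace=True call to a variable would store None, so the two styles should not be mixed.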
UPDATE
Today I also learned that being able to update a dataframe is one of the most useful things pandas offers. It saves me from building a new dataset every time some values change in the existing one. These are examples of how I update a dataframe in pandas.
Updating columns
Case 1: Say you have three columns, first, last and emailPri, and you want to rename all the columns in the dataframe. All you have to do is assign a list with one new name per column, in order. Consider the code below.
Julius.columns = ['first_name', 'last_name', 'email']
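Here is a small sketch of Case 1 that can be run on its own. The dataframe contents are invented for illustration. It also shows what happens when the list has the wrong number of names, which is an easy mistake to make if a comma is missing.

```python
import pandas as pd

# Hypothetical data with the three original column names from Case 1.
df = pd.DataFrame({
    "first": ["Ada", "Tunde"],
    "last": ["Obi", "Bello"],
    "emailPri": ["ada@example.com", "tunde@example.com"],
})

# Assign one new name per column, in the same order as the columns.
df.columns = ["first_name", "last_name", "email"]

# A list with the wrong number of names raises a ValueError.
length_error = None
try:
    df.columns = ["first_name", "last_nameemail"]
except ValueError as exc:
    length_error = exc

print(list(df.columns))
```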
Case 2: Say you want to rename only specific columns. All you have to do is pass a dictionary that maps each old column name to its new name.
Julius.rename(columns={'first_name': 'her_name', 'last_name': 'the_name'}, inplace=True)
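The sketch below shows Case 2 on a made-up dataframe. It also shows that, by default, rename quietly skips any key that does not match an existing column, so a stray space in a column name means nothing gets renamed.

```python
import pandas as pd

# Hypothetical data using the column names from Case 1.
df = pd.DataFrame({
    "first_name": ["Ada", "Tunde"],
    "last_name": ["Obi", "Bello"],
    "email": ["ada@example.com", "tunde@example.com"],
})

# Map old names to new names; columns not in the dictionary stay as they are.
renamed = df.rename(columns={"first_name": "her_name", "last_name": "the_name"})

# A key that matches no column (note the leading space) is ignored by default.
unchanged = df.rename(columns={" last_name": "the_name"})

print(list(renamed.columns))
print(list(unchanged.columns))
```

Without inplace=True, rename returns a new dataframe, which is why the results are stored in new variables here.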
You can also take a look at the pictures below to see how I updated the column named last and made all of its values lower case.
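In code, one way to do this is with the .str.lower() accessor. This is only a sketch, assuming the column holds text values and that "lower case" refers to the values rather than the column name. The sample data is invented.

```python
import pandas as pd

# Hypothetical data with a column named "last" in mixed case.
df = pd.DataFrame({
    "first": ["Ada", "Tunde"],
    "last": ["OBI", "Bello"],
})

# Replace the column with a lower-cased copy of its values.
df["last"] = df["last"].str.lower()

print(df["last"].tolist())
```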