Create New Columns by Calculating with Existing Columns

Panda Kim 2022. 11. 15. 05:56

1. Calculate with columns directly to add a new column

Suppose there is a DataFrame holding scores in different subjects from 2019 to 2022.

You can do various calculations directly on its columns.

Below is the code that builds the DataFrame.

import pandas as pd

dict_scores = {'Year' : [2019, 2020, 2021, 2022],
               'Math' : [80, 70, 90, 100],
               'History' : [60, 40, 70, 60],
               'Science' : [70, 80, 60, 90]}

df_scores = pd.DataFrame(dict_scores)

2. Sum up each row

Suppose you want to sum the scores for each year.

df_scores['Math'] + df_scores['History'] + df_scores['Science']

 

This returns a Series of the score sums.

Besides addition, you can also do subtraction, multiplication, and division.
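As a small sketch of the other operations, the block below rebuilds the same DataFrame and applies subtraction, multiplication, and division to pairs of columns. The variable names diff, product, and ratio are only illustrative and are not part of the original example.

```python
import pandas as pd

df_scores = pd.DataFrame({'Year': [2019, 2020, 2021, 2022],
                          'Math': [80, 70, 90, 100],
                          'History': [60, 40, 70, 60],
                          'Science': [70, 80, 60, 90]})

# Each operation works element by element and returns a new Series.
diff = df_scores['Math'] - df_scores['History']     # subtraction
product = df_scores['Math'] * df_scores['Science']  # multiplication
ratio = df_scores['Math'] / df_scores['Science']    # division, gives floats

print(diff.tolist())     # [20, 30, 20, 40]
print(product.tolist())  # [5600, 5600, 5400, 9000]
```

Division returns floating-point values even when both columns hold integers.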


If you want to add the Series as a new column at the right end of the DataFrame, assign it to a new column name:

df_scores['Sum'] = df_scores['Math'] + df_scores['History'] + df_scores['Science']
df_scores
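For comparison, here is a sketch of a different way to get the same result, which this post does not otherwise use: select the subject columns and call sum(axis=1), which adds across each row. The list name subjects is just an illustrative helper.

```python
import pandas as pd

df_scores = pd.DataFrame({'Year': [2019, 2020, 2021, 2022],
                          'Math': [80, 70, 90, 100],
                          'History': [60, 40, 70, 60],
                          'Science': [70, 80, 60, 90]})

# Illustrative helper: the columns that should be added together.
subjects = ['Math', 'History', 'Science']

# axis=1 sums across the columns of each row; 'Year' is left out on purpose.
df_scores['Sum'] = df_scores[subjects].sum(axis=1)

print(df_scores['Sum'].tolist())  # [210, 190, 220, 250]
```

This can be convenient when there are many columns to add, since only the list of column names has to change.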


3. Get the average of each row

If you want to add a column with the average score for each year,

df_scores['Average'] = (df_scores['Math'] + df_scores['History'] + df_scores['Science']) / 3
df_scores

Even though the denominator 3 is a plain number, not a Series or a DataFrame,

it is applied to every element of the Series, so the calculation still works.
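A minimal sketch of this behavior, using a standalone Series with the Math scores. The names math, halved, and bonus are only illustrative.

```python
import pandas as pd

math = pd.Series([80, 70, 90, 100], name='Math')

# The single number is applied to every element of the Series.
halved = math / 2  # division gives floats
bonus = math + 5   # addition keeps integers

print(halved.tolist())  # [40.0, 35.0, 45.0, 50.0]
print(bonus.tolist())   # [85, 75, 95, 105]
```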


In order to calculate with a whole column,

the other value or array has to be of a type compatible with the elements in the column, for example numbers with numbers.
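As a sketch of what this means in practice, assume a column of scores was read in as strings (the Series raw below is made up for illustration). Arithmetic between strings and numbers typically fails with a TypeError, so one option is to convert the column with pd.to_numeric first.

```python
import pandas as pd

# Hypothetical column whose scores were stored as text.
raw = pd.Series(['80', '70', '90', '100'], name='Math')

# Convert the strings to numbers before doing arithmetic.
numeric = pd.to_numeric(raw)

# Now the column can be combined with a number as usual.
doubled = numeric * 2

print(doubled.tolist())  # [160, 140, 180, 200]
```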