Let's say you have a DataFrame of four animals.
The code below creates it.
import pandas as pd
dict_zoo = {'Name': ['Panda', 'Rabbit', 'Lion', 'Duck'],
            'Food': ['Carrot', 'Bamboo', 'Meat', 'Fish'],
            'Age': [5, 2, 7, 3],
            'Species': ['Mammals', 'Mammals', 'Mammals', 'Birds']}
df_zoo = pd.DataFrame(dict_zoo)
Suppose you need only one column, or just a few of the columns.
For example, you want only the food data.
You can select a column by putting ['column name'] right after the DataFrame.
series_food = df_zoo['Food']
series_food
The code returns the Food column.
When you select only one column of a DataFrame this way,
it returns a Series, not a one-column DataFrame.
Still, the Series keeps the index and the column name.
series_food.index
series_food.name
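As a quick check of the point above, here is a minimal sketch that rebuilds the same df_zoo and inspects the selected column. It assumes the default integer index used in this example; the variable index_labels is introduced here only for illustration.

```python
import pandas as pd

# Same example data as in the post.
df_zoo = pd.DataFrame({
    'Name': ['Panda', 'Rabbit', 'Lion', 'Duck'],
    'Food': ['Carrot', 'Bamboo', 'Meat', 'Fish'],
    'Age': [5, 2, 7, 3],
    'Species': ['Mammals', 'Mammals', 'Mammals', 'Birds'],
})

series_food = df_zoo['Food']

# Selecting one column with a single label gives a Series.
print(type(series_food).__name__)   # Series

# The Series keeps the row labels of the DataFrame ...
index_labels = list(series_food.index)
print(index_labels)                 # [0, 1, 2, 3]

# ... and remembers which column it came from.
print(series_food.name)             # Food
```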
If you want to convert the Series back into a DataFrame,
use the .to_frame() method.
series_food.to_frame()
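The conversion can be verified with a short sketch like the one below. The names df_food and df_diet, and the 'Diet' label passed to the optional name argument, are hypothetical and only there to illustrate the call.

```python
import pandas as pd

# Same example data as in the post.
df_zoo = pd.DataFrame({
    'Name': ['Panda', 'Rabbit', 'Lion', 'Duck'],
    'Food': ['Carrot', 'Bamboo', 'Meat', 'Fish'],
    'Age': [5, 2, 7, 3],
    'Species': ['Mammals', 'Mammals', 'Mammals', 'Birds'],
})

series_food = df_zoo['Food']

# to_frame() wraps the Series in a one-column DataFrame.
df_food = series_food.to_frame()
print(type(df_food).__name__)   # DataFrame
print(list(df_food.columns))    # ['Food']
print(df_food.shape)            # (4, 1)

# Optionally, name= sets a different column label
# ('Diet' is just an illustrative label).
df_diet = series_food.to_frame(name='Diet')
print(list(df_diet.columns))    # ['Diet']
```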
Besides the food data,
you might also want to know which animal each row belongs to, so you need more than one column.
Put a list of the column names you want to select inside the [brackets].
First, make the list.
column_names = ['Name', 'Food']
Next, use the list to subset the DataFrame.
df_zoo[column_names]
# is equivalent to ... df_zoo[['Name', 'Food']]
Assigning the list to a variable first is often better,
because it makes the code cleaner and keeps you
from getting confused by the nested [brackets].
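Two details of list-based selection are worth seeing in action: the result follows the order of the list, and a name that is not a column raises a KeyError. The sketch below is only an illustration; 'Weight' is a hypothetical column that does not exist in df_zoo, and missing_raised is a helper flag introduced here.

```python
import pandas as pd

# Same example data as in the post.
df_zoo = pd.DataFrame({
    'Name': ['Panda', 'Rabbit', 'Lion', 'Duck'],
    'Food': ['Carrot', 'Bamboo', 'Meat', 'Fish'],
    'Age': [5, 2, 7, 3],
    'Species': ['Mammals', 'Mammals', 'Mammals', 'Birds'],
})

# The columns come back in the order given in the list,
# not in the order they have in the original DataFrame.
column_names = ['Food', 'Name']
df_subset = df_zoo[column_names]
print(list(df_subset.columns))   # ['Food', 'Name']

# The original DataFrame still has all four columns.
print(list(df_zoo.columns))      # ['Name', 'Food', 'Age', 'Species']

# A name that is not a column ('Weight' is hypothetical) raises KeyError.
try:
    df_zoo[['Name', 'Weight']]
    missing_raised = False
except KeyError:
    missing_raised = True
print(missing_raised)            # True
```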
Going back to selecting a single column,
you can use the same list method to select just one column.
column_names = ['Food']
df_zoo[column_names]
By doing this, you can skip the extra step
of converting a Series into a DataFrame.
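To make the difference between the two selection styles concrete, here is a small sketch that compares them side by side. The names as_series and as_frame are introduced only for this comparison.

```python
import pandas as pd

# Same example data as in the post.
df_zoo = pd.DataFrame({
    'Name': ['Panda', 'Rabbit', 'Lion', 'Duck'],
    'Food': ['Carrot', 'Bamboo', 'Meat', 'Fish'],
    'Age': [5, 2, 7, 3],
    'Species': ['Mammals', 'Mammals', 'Mammals', 'Birds'],
})

as_series = df_zoo['Food']     # single label  -> Series
as_frame = df_zoo[['Food']]    # list of labels -> DataFrame

print(as_series.ndim, as_series.shape)   # 1 (4,)
print(as_frame.ndim, as_frame.shape)     # 2 (4, 1)

# Selecting with a one-item list gives the same result as
# selecting a Series and then calling to_frame().
print(as_frame.equals(as_series.to_frame()))   # True
```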