Selecting Columns from a Pandas DataFrame

by Panda Kim 2022. 11. 12. 16:39

0. How to select rows from a DataFrame?


1. Select a Single Column

Let's say you have a DataFrame of four animals.

The code below creates the DataFrame.

import pandas as pd

dict_zoo = {'Name' : ['Panda', 'Rabbit', 'Lion', 'Duck'],
            'Food' : ['Carrot', 'Bamboo', 'Meat', 'Fish'],
            'Age' : [5, 2, 7, 3],
            'Species' : ['Mammals', 'Mammals', 'Mammals', 'Birds']}
            
df_zoo = pd.DataFrame(dict_zoo)

Sometimes you need only one column, or just some of the columns.

For example, suppose you want only the food data.


You can select it by putting [column name] right after the DataFrame.

series_food = df_zoo['Food']
series_food

The code returns the values of the Food column.

When you select only one column of a DataFrame,

the result is a Series, not a DataFrame.
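If you want to check the return type yourself, here is a minimal sketch. It assumes the same df_zoo data as above and only prints the type, shape, and values of the selected column.

```python
import pandas as pd

# Same data as the df_zoo example above.
df_zoo = pd.DataFrame({
    'Name': ['Panda', 'Rabbit', 'Lion', 'Duck'],
    'Food': ['Carrot', 'Bamboo', 'Meat', 'Fish'],
    'Age': [5, 2, 7, 3],
    'Species': ['Mammals', 'Mammals', 'Mammals', 'Birds'],
})

series_food = df_zoo['Food']

# A single column label gives back a one-dimensional Series.
print(type(series_food).__name__)   # Series
print(series_food.shape)            # (4,)
print(series_food.tolist())         # ['Carrot', 'Bamboo', 'Meat', 'Fish']
```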


Still, the Series keeps the row index and the column name.

series_food.index

series_food.name
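As a rough illustration of what these two attributes hold, the sketch below rebuilds the same df_zoo and prints them. Because the DataFrame was created without an explicit index, pandas uses the default 0 to 3 range.

```python
import pandas as pd

# Same data as the df_zoo example above.
df_zoo = pd.DataFrame({
    'Name': ['Panda', 'Rabbit', 'Lion', 'Duck'],
    'Food': ['Carrot', 'Bamboo', 'Meat', 'Fish'],
    'Age': [5, 2, 7, 3],
    'Species': ['Mammals', 'Mammals', 'Mammals', 'Birds'],
})

series_food = df_zoo['Food']

# The Series carries over the row index of the DataFrame.
print(series_food.index.tolist())               # [0, 1, 2, 3]
print(series_food.index.equals(df_zoo.index))   # True

# The Series name is the label of the column it came from.
print(series_food.name)                         # Food
```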


If you want to convert the Series back into a DataFrame,

use the .to_frame() method.

series_food.to_frame()
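Here is a small sketch of what .to_frame() gives back, again assuming the same df_zoo data. The column label 'Diet' in the last step is only a made-up name for illustration; it shows the optional name argument of .to_frame().

```python
import pandas as pd

# Same data as the df_zoo example above.
df_zoo = pd.DataFrame({
    'Name': ['Panda', 'Rabbit', 'Lion', 'Duck'],
    'Food': ['Carrot', 'Bamboo', 'Meat', 'Fish'],
    'Age': [5, 2, 7, 3],
    'Species': ['Mammals', 'Mammals', 'Mammals', 'Birds'],
})

series_food = df_zoo['Food']

# The Series name becomes the column label of the new DataFrame.
df_food = series_food.to_frame()
print(type(df_food).__name__)       # DataFrame
print(df_food.columns.tolist())     # ['Food']
print(df_food.shape)                # (4, 1)

# 'Diet' is a hypothetical label, passed via the optional name argument.
df_diet = series_food.to_frame(name='Diet')
print(df_diet.columns.tolist())     # ['Diet']
```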


2. Select Multiple Columns of DataFrame

In addition to a single column,

you might also want the animals' names.


Put a list of the column names you want to select inside the [bracket].

First, make the list.

column_names = ['Name', 'Food']

Next, use the list to subset the DataFrame.

df_zoo[column_names]
# is equivalent to ... df_zoo[['Name', 'Food']]

Assigning the list to a variable first is better,

because it makes the code cleaner and keeps you

from getting confused by the nested [bracket]s.
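To see why the nested brackets matter, here is a minimal sketch using the same df_zoo data. It shows that a list of names returns a DataFrame whose columns follow the order of the list, and that leaving out the inner brackets raises a KeyError, because pandas then looks for a single column labeled with the tuple ('Name', 'Food').

```python
import pandas as pd

# Same data as the df_zoo example above.
df_zoo = pd.DataFrame({
    'Name': ['Panda', 'Rabbit', 'Lion', 'Duck'],
    'Food': ['Carrot', 'Bamboo', 'Meat', 'Fish'],
    'Age': [5, 2, 7, 3],
    'Species': ['Mammals', 'Mammals', 'Mammals', 'Birds'],
})

column_names = ['Name', 'Food']
df_subset = df_zoo[column_names]

# A list of labels gives back a DataFrame with just those columns.
print(type(df_subset).__name__)      # DataFrame
print(df_subset.shape)               # (4, 2)

# The column order follows the list, not the original DataFrame.
df_swapped = df_zoo[['Food', 'Name']]
print(df_swapped.columns.tolist())   # ['Food', 'Name']

# Forgetting the inner brackets fails: there is no column ('Name', 'Food').
try:
    df_zoo['Name', 'Food']
    raised = False
except KeyError:
    raised = True
print(raised)                        # True
```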


Going back to #1,

you can use the same method from #2 to select a single column.

column_names = ['Food']
df_zoo[column_names]

By doing this, you can skip the extra step

of converting a Series into a DataFrame.
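As a quick check, the sketch below (same df_zoo data assumed) compares the one-item list selection with the two-step route through .to_frame(). Both should give the same one-column DataFrame.

```python
import pandas as pd

# Same data as the df_zoo example above.
df_zoo = pd.DataFrame({
    'Name': ['Panda', 'Rabbit', 'Lion', 'Duck'],
    'Food': ['Carrot', 'Bamboo', 'Meat', 'Fish'],
    'Age': [5, 2, 7, 3],
    'Species': ['Mammals', 'Mammals', 'Mammals', 'Birds'],
})

# One step: a list with a single label returns a DataFrame directly.
df_food_direct = df_zoo[['Food']]

# Two steps: select a Series first, then convert it.
df_food_converted = df_zoo['Food'].to_frame()

print(type(df_food_direct).__name__)              # DataFrame
print(df_food_direct.shape)                       # (4, 1)
print(df_food_direct.equals(df_food_converted))   # True
```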

 

 

Related post (2022.11.11): Select Pandas DataFrame Rows, Difference between .iloc and .loc
