Adding and Deleting Rows and Columns in a Python DataFrame

This post shows how to insert data into and delete data from a DataFrame, a data type organized into rows and columns.

Adding and deleting rows and columns in a DataFrame

Python

import pandas as pd
import seaborn as sns
penguins = sns.load_dataset('penguins')
df = penguins.copy()

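Adding and deleting rows

Adding rows

There are three main ways to add a row in pandas [1].

  • Using df.loc
  • Using df.append() (removed in pandas 2.0)
  • Using concat()

Using df.loc

A row can be added by assigning to an explicitly specified index label.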

df.loc[[len(df.index)-1]]

     species  island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g   sex
343   Gentoo  Biscoe            49.9           16.1              213.0       5400.0  Male

df.loc[len(df.index)] = ['Adelie', 'Torgersen', 39.2, 19.3, 180.2, 8331, 'Male']
df.loc[len(df.index)-2:]

     species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g   sex
343   Gentoo     Biscoe            49.9           16.1              213.0       5400.0  Male
344   Adelie  Torgersen            39.2           19.3              180.2       8331.0  Male
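
The df.loc[len(df.index)] pattern appends a row only when the index is the default 0..n-1 range. The sketch below, which uses a small made-up DataFrame (the name toy and its values are assumptions for illustration, not part of the penguins data), shows how the same expression can overwrite an existing row once labels are no longer contiguous, and one possible workaround.

```python
import pandas as pd

# Toy frame standing in for the penguins data (hypothetical values).
toy = pd.DataFrame({"species": ["Adelie", "Gentoo", "Chinstrap"],
                    "body_mass_g": [3750.0, 5400.0, 3500.0]})

# Dropping the first row leaves labels 1 and 2, so len(toy.index) == 2.
toy = toy.drop(index=0)

# Label 2 already exists: this overwrites the Chinstrap row instead of appending.
toy.loc[len(toy.index)] = ["Adelie", 8331.0]
overwritten = toy.copy()

# One possible workaround: derive the next label from the current maximum.
toy.loc[toy.index.max() + 1] = ["Gentoo", 5000.0]
print(toy)
```

Calling reset_index(drop=True) before appending is another way to get back to a contiguous index.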

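Using df.append()

The append() method [2] adds a row. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so the code in this subsection runs only on older versions; on current versions use concat() instead.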

df_new = pd.Series(['Adelie', 'Torgersen', 39.2, 19.3, 180.2, 8331, 'Male'], index = df.columns)
df_new

species                 Adelie
island               Torgersen
bill_length_mm            39.2
bill_depth_mm             19.3
flipper_length_mm        180.2
body_mass_g               8331
sex                       Male
dtype: object

# pandas < 2.0 only: DataFrame.append() no longer exists in pandas 2.0 and later
df = df.append(df_new, ignore_index=True)
df.tail()

     species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
340   Gentoo     Biscoe            46.8           14.3              215.0       4850.0  Female
341   Gentoo     Biscoe            50.4           15.7              222.0       5750.0    Male
342   Gentoo     Biscoe            45.2           14.8              212.0       5200.0  Female
343   Gentoo     Biscoe            49.9           16.1              213.0       5400.0    Male
344   Adelie  Torgersen            39.2           19.3              180.2       8331.0    Male
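
On pandas versions where DataFrame.append() is unavailable, a Series like df_new can still be added by first turning it into a one-row DataFrame. The following is a minimal sketch of that idea on a made-up frame (toy, new_row, and the values are assumptions for illustration); it is one possible substitute, not the only one.

```python
import pandas as pd

toy = pd.DataFrame({"species": ["Adelie", "Gentoo"],
                    "body_mass_g": [3750.0, 5400.0]})
new_row = pd.Series(["Chinstrap", 3500.0], index=toy.columns)

# Series -> one-row DataFrame: to_frame() yields a single column, .T makes it a row.
new_frame = new_row.to_frame().T

# Rough equivalent of toy.append(new_row, ignore_index=True).
result = pd.concat([toy, new_frame], ignore_index=True)

# The transposed frame holds generic objects, so ask pandas to re-infer column dtypes.
result = result.infer_objects()
print(result)
```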

Using concat()

The concat() function [3] can also be used to add a row.

df_new = pd.DataFrame([['Adelie', 'Torgersen', 39.2, 19.3, 180.2, 8331, 'Male']], columns = df.columns)
pd.concat([df, df_new], ignore_index = True).tail()

     species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
340   Gentoo     Biscoe            46.8           14.3              215.0       4850.0  Female
341   Gentoo     Biscoe            50.4           15.7              222.0       5750.0    Male
342   Gentoo     Biscoe            45.2           14.8              212.0       5200.0  Female
343   Gentoo     Biscoe            49.9           16.1              213.0       5400.0    Male
344   Adelie  Torgersen            39.2           19.3              180.2       8331.0    Male
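
When several rows need to be added, it is usually simpler to collect them first and call concat() once. The sketch below assumes a made-up frame and a hypothetical list of new observations (toy and new_rows are illustrative names only).

```python
import pandas as pd

toy = pd.DataFrame({"species": ["Adelie"], "body_mass_g": [3750.0]})

# Hypothetical new observations gathered one at a time.
new_rows = [
    {"species": "Gentoo", "body_mass_g": 5400.0},
    {"species": "Chinstrap", "body_mass_g": 3500.0},
]

# Build one DataFrame from all new rows, then concatenate once.
result = pd.concat([toy, pd.DataFrame(new_rows)], ignore_index=True)
print(result)
```

concat() returns a new DataFrame and leaves its inputs unchanged, so the result has to be assigned to a name to be kept.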

Deleting rows

To delete data from a DataFrame, use the drop() method [4][5].

DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
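
As a rough illustration of the parameters in the signature above, the sketch below applies them to a small made-up DataFrame (toy and the column names a and b are assumptions for illustration). It shows that labels with axis=0 matches index=, that columns= targets columns, and that errors='ignore' suppresses the error for missing labels.

```python
import pandas as pd

toy = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# These two calls remove the same row (label 0).
by_axis = toy.drop(labels=[0], axis=0)
by_index = toy.drop(index=[0])

# columns= removes columns instead of rows.
without_b = toy.drop(columns=["b"])

# Label 99 does not exist; errors="ignore" returns the frame unchanged
# instead of raising KeyError.
unchanged = toy.drop(index=[99], errors="ignore")
```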

# Drop the rows at positions 1 and 3
df.drop(df.index[[1, 3]]).head()

   species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
0   Adelie  Torgersen            39.1           18.7              181.0       3750.0    Male
2   Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female
4   Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female
5   Adelie  Torgersen            39.3           20.6              190.0       3650.0    Male
6   Adelie  Torgersen            38.9           17.8              181.0       3625.0  Female

# Drop every row from position 100 onward
df.drop(df.index[100:]).tail()

    species  island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
95   Adelie   Dream            40.8           18.9              208.0       4300.0    Male
96   Adelie   Dream            38.1           18.6              190.0       3700.0  Female
97   Adelie   Dream            40.3           18.5              196.0       4350.0    Male
98   Adelie   Dream            33.1           16.1              178.0       2900.0  Female
99   Adelie   Dream            43.2           18.5              192.0       4100.0    Male

# Drop the rows that satisfy a condition
df.drop(df[df['species']=='Adelie'].index).head()

       species  island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
152  Chinstrap   Dream            46.5           17.9              192.0       3500.0  Female
153  Chinstrap   Dream            50.0           19.5              196.0       3900.0    Male
154  Chinstrap   Dream            51.3           19.2              193.0       3650.0    Male
155  Chinstrap   Dream            45.4           18.7              188.0       3525.0  Female
156  Chinstrap   Dream            52.7           19.8              197.0       3725.0    Male
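
Condition-based deletion can also be written as a boolean filter that keeps the complementary rows. The sketch below compares the two forms on a made-up frame (toy and its values are assumptions for illustration) and shows that the surviving rows keep their original labels unless the index is reset.

```python
import pandas as pd

toy = pd.DataFrame({"species": ["Adelie", "Chinstrap", "Adelie", "Gentoo"],
                    "body_mass_g": [3750.0, 3500.0, 3250.0, 5400.0]})

# drop() with the index of the matching rows ...
dropped = toy.drop(toy[toy["species"] == "Adelie"].index)

# ... gives the same rows as keeping everything that does not match.
kept = toy[toy["species"] != "Adelie"]

# Surviving rows keep labels 1 and 3; renumber from 0 if that is preferred.
renumbered = kept.reset_index(drop=True)
```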

Adding and deleting columns

Adding columns

import pandas as pd
import seaborn as sns
df = sns.load_dataset("penguins")

df.head()

  species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
0  Adelie  Torgersen            39.1           18.7              181.0       3750.0    Male
1  Adelie  Torgersen            39.5           17.4              186.0       3800.0  Female
2  Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female
3  Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN
4  Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female

bill_ratio = df['bill_length_mm']/df['bill_depth_mm']
bill_ratio

0      2.090909
1      2.270115
2      2.238889
3           NaN
4      1.901554
         ...   
339         NaN
340    3.272727
341    3.210191
342    3.054054
343    3.099379
Length: 344, dtype: float64

df['bill_ratio'] = bill_ratio
df.head()

  species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex  bill_ratio
0  Adelie  Torgersen            39.1           18.7              181.0       3750.0    Male    2.090909
1  Adelie  Torgersen            39.5           17.4              186.0       3800.0  Female    2.270115
2  Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female    2.238889
3  Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN         NaN
4  Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female    1.901554
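
Besides direct assignment, pandas offers assign() and insert() for adding columns. The sketch below is a minimal illustration on a made-up two-row frame (toy, with_ratio, and positioned are assumed names); it shows that assign() returns a new DataFrame while insert() modifies the frame in place and controls the column position.

```python
import pandas as pd

toy = pd.DataFrame({"bill_length_mm": [39.1, 39.5],
                    "bill_depth_mm": [18.7, 17.4]})

# assign() returns a new DataFrame and leaves toy untouched.
with_ratio = toy.assign(bill_ratio=toy["bill_length_mm"] / toy["bill_depth_mm"])

# insert() modifies the frame in place and puts the column at position 0.
positioned = toy.copy()
positioned.insert(0, "bill_ratio",
                  positioned["bill_length_mm"] / positioned["bill_depth_mm"])
```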

Deleting columns

df.drop(columns = ['sex', 'bill_ratio'], inplace=True)
df.head()

  species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g
0  Adelie  Torgersen            39.1           18.7              181.0       3750.0
1  Adelie  Torgersen            39.5           17.4              186.0       3800.0
2  Adelie  Torgersen            40.3           18.0              195.0       3250.0
3  Adelie  Torgersen             NaN            NaN                NaN          NaN
4  Adelie  Torgersen            36.7           19.3              193.0       3450.0

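By default drop() returns a new DataFrame and leaves the original unchanged. Pass inplace=True when the change should be applied to the original DataFrame.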
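
The difference between the default behavior and inplace=True can be sketched on a made-up one-row frame (toy and toy2 are assumed names for illustration). With inplace=True the method returns None, so its result should not be assigned back.

```python
import pandas as pd

toy = pd.DataFrame({"species": ["Adelie"], "sex": ["Male"]})

# Default: drop() returns a new frame; toy still has the 'sex' column.
dropped = toy.drop(columns=["sex"])

# inplace=True: toy2 itself is modified and the call returns None.
toy2 = toy.copy()
ret = toy2.drop(columns=["sex"], inplace=True)
```

Reassigning the result (df = df.drop(...)) has the same visible effect as inplace=True in most scripts.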

References

This post is licensed under CC BY 4.0 by the author.