In [1]:
import pandas as pd
In [3]:
bank_df = pd.read_csv('bank customers.csv'); bank_df
Out[3]:
 | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.00 | 1 | 1 | 1 | 101348.88 | 1 |
1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.80 | 3 | 1 | 0 | 113931.57 | 1 |
3 | 4 | 15701354 | Boni | 699 | France | Female | 39 | 1 | 0.00 | 2 | 0 | 0 | 93826.63 | 0 |
4 | 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.10 | 0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9995 | 9996 | 15606229 | Obijiaku | 771 | France | Male | 39 | 5 | 0.00 | 2 | 1 | 0 | 96270.64 | 0 |
9996 | 9997 | 15569892 | Johnstone | 516 | France | Male | 35 | 10 | 57369.61 | 1 | 1 | 1 | 101699.77 | 0 |
9997 | 9998 | 15584532 | Liu | 709 | France | Female | 36 | 7 | 0.00 | 1 | 0 | 1 | 42085.58 | 1 |
9998 | 9999 | 15682355 | Sabbatini | 772 | Germany | Male | 42 | 3 | 75075.31 | 2 | 1 | 0 | 92888.52 | 1 |
9999 | 10000 | 15628319 | Walker | 792 | France | Female | 28 | 4 | 130142.79 | 1 | 1 | 0 | 38190.78 | 0 |
10000 rows × 14 columns
In [5]:
# Return a column
# The output will be a Pandas Series
sample = bank_df["Surname"]; sample
Out[5]:
0 Hargrave
1 Hill
2 Onio
3 Boni
4 Mitchell
...
9995 Obijiaku
9996 Johnstone
9997 Liu
9998 Sabbatini
9999 Walker
Name: Surname, Length: 10000, dtype: object
In [6]:
# Alternatively, use dot (attribute) notation
# This does not work if the column name contains spaces.
bank_df.Surname
Out[6]:
0 Hargrave
1 Hill
2 Onio
3 Boni
4 Mitchell
...
9995 Obijiaku
9996 Johnstone
9997 Liu
9998 Sabbatini
9999 Walker
Name: Surname, Length: 10000, dtype: object
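The limit on dot notation noted above can be checked on a small frame. The following is a minimal sketch: `toy_df` and its `Credit Score` column (with a space) are hypothetical and are not part of `bank customers.csv`. It shows that bracket and dot access agree for a simple name, that the dotted form of a name with a space is not valid Python, and that renaming the column is one possible workaround.

```python
import pandas as pd

# Hypothetical toy data; the "Credit Score" column name (with a space)
# is an assumption made only for this illustration.
toy_df = pd.DataFrame({
    "Surname": ["Hargrave", "Hill"],
    "Credit Score": [619, 608],
})

# For a simple name, bracket and dot notation return the same Series.
same = toy_df["Surname"].equals(toy_df.Surname)

# A name containing a space is reachable with brackets.
scores = toy_df["Credit Score"]

# The dotted form cannot even be written: it is a syntax error.
try:
    compile("toy_df.Credit Score", "<example>", "eval")
    dot_form_parses = True
except SyntaxError:
    dot_form_parses = False

# One possible workaround: replace spaces in the column names first.
renamed = toy_df.rename(columns=lambda name: name.replace(" ", "_"))
renamed_scores = renamed.Credit_Score
```

Dot notation also returns the DataFrame attribute rather than the column when a column shares its name with an existing attribute or method such as `count`, so bracket notation is the safer default.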
In [7]:
# Select multiple columns
# Pass a list containing the names of all the columns you want to select
sample = bank_df[ ['Surname','Age','Balance'] ]
sample
Out[7]:
 | Surname | Age | Balance |
---|---|---|---|
0 | Hargrave | 42 | 0.00 |
1 | Hill | 41 | 83807.86 |
2 | Onio | 42 | 159660.80 |
3 | Boni | 39 | 0.00 |
4 | Mitchell | 43 | 125510.82 |
... | ... | ... | ... |
9995 | Obijiaku | 39 | 0.00 |
9996 | Johnstone | 35 | 57369.61 |
9997 | Liu | 36 | 0.00 |
9998 | Sabbatini | 42 | 75075.31 |
9999 | Walker | 28 | 130142.79 |
10000 rows × 3 columns
In [9]:
# Alternatively, define the list first and then use it to select columns
selected_columns = ['Surname','Balance']
sample = bank_df[selected_columns]
sample
Out[9]:
 | Surname | Balance |
---|---|---|
0 | Hargrave | 0.00 |
1 | Hill | 83807.86 |
2 | Onio | 159660.80 |
3 | Boni | 0.00 |
4 | Mitchell | 125510.82 |
... | ... | ... |
9995 | Obijiaku | 0.00 |
9996 | Johnstone | 57369.61 |
9997 | Liu | 0.00 |
9998 | Sabbatini | 75075.31 |
9999 | Walker | 130142.79 |
10000 rows × 2 columns
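The difference between selecting with a string and selecting with a list can be seen in the type of the result. This is a minimal sketch on a hypothetical three-row `toy_df` that reuses the first three rows shown above. A single string gives a Series, a list gives a DataFrame (even with one name), and the columns come back in the order listed.

```python
import pandas as pd

# Hypothetical toy frame built from the first three rows shown above.
toy_df = pd.DataFrame({
    "Surname": ["Hargrave", "Hill", "Onio"],
    "Age": [42, 41, 42],
    "Balance": [0.00, 83807.86, 159660.80],
})

# A single string key returns a Series.
one_col_series = toy_df["Surname"]

# A list key returns a DataFrame, even when the list holds one name.
one_col_frame = toy_df[["Surname"]]

# The result follows the order of the list, not the original column order.
reordered = toy_df[["Balance", "Surname"]]

print(type(one_col_series).__name__)
print(type(one_col_frame).__name__)
print(list(reordered.columns))
```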
In [10]:
# Accessing rows with a slice (here, the first two rows)
bank_df[0:2]
Out[10]:
 | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.00 | 1 | 1 | 1 | 101348.88 | 1 |
1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
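Row slicing with plain brackets can be compared with explicit positional indexing. This is a minimal sketch on a hypothetical four-row `toy_df`. Using `iloc` here is a suggestion for comparison and is not something the cells above do. The slice excludes its stop position, and a bare integer inside brackets is treated as a column label rather than a row position.

```python
import pandas as pd

# Hypothetical toy frame; the values are taken from the rows shown above.
toy_df = pd.DataFrame({
    "Surname": ["Hargrave", "Hill", "Onio", "Boni"],
    "Age": [42, 41, 42, 39],
})

# A slice in plain brackets selects rows by position; the stop is excluded.
first_two = toy_df[0:2]

# iloc makes the positional intent explicit and gives the same rows.
same_rows = toy_df.iloc[0:2]

# A single position with iloc returns that row as a Series.
single_row = toy_df.iloc[0]

# A bare integer in plain brackets is looked up as a column label,
# so it raises KeyError here because no column is named 0.
try:
    toy_df[0]
    int_key_selects_row = True
except KeyError:
    int_key_selects_row = False
```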