In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
In [35]:
echo_df = pd.read_csv('Echodot2_Reviews.csv', encoding='utf-8')
echo_df.head()
Out[35]:
 | Rating | Review Date | Configuration Text | Review Text | Review Color | Title | User Verified | Review Useful Count | Declaration Text | Pageurl |
---|---|---|---|---|---|---|---|---|---|---|
0 | 3 | 10/3/2017 | Echo Dot | Not great speakers | Black | Three Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
1 | 4 | 9/26/2017 | Echo Dot | Great little gagit | White | Four Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
2 | 5 | 9/8/2017 | Echo Dot | Awesome 👏🏽 | White | Awesome! | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
3 | 5 | 10/19/2017 | Echo Dot | Love my Echo | Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
4 | 5 | 9/17/2017 | Echo Dot | Great device | Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
In [15]:
# echo_df.head(3)
# echo_df.tail(3)
In [36]:
echo_df.describe()
Out[36]:
 | Rating | Review Useful Count |
---|---|---|
count | 6855.000000 | 28.000000 |
mean | 4.207002 | 17.071429 |
std | 1.272551 | 58.266265 |
min | 1.000000 | 2.000000 |
25% | 4.000000 | 2.000000 |
50% | 5.000000 | 2.000000 |
75% | 5.000000 | 2.000000 |
max | 5.000000 | 284.000000 |
In [37]:
echo_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6855 entries, 0 to 6854
Data columns (total 10 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Rating 6855 non-null int64
1 Review Date 6855 non-null object
2 Configuration Text 6855 non-null object
3 Review Text 6852 non-null object
4 Review Color 6855 non-null object
5 Title 6855 non-null object
6 User Verified 6641 non-null object
7 Review Useful Count 28 non-null float64
8 Declaration Text 6 non-null object
9 Pageurl 6855 non-null object
dtypes: float64(1), int64(1), object(8)
memory usage: 535.7+ KB
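The `info()` output above shows that several columns have fewer non-null entries than the 6855 rows (for example, `Review Text` has 6852). As a rough sketch, the gaps can be counted directly with `isna()`. The small `sample` frame below is a made-up stand-in, not the real CSV, so the snippet runs on its own.

```python
import pandas as pd

# Hypothetical stand-in rows (not the real Echodot2_Reviews.csv data),
# with a few missing values to mimic the gaps seen in info().
sample = pd.DataFrame({
    "Rating": [3, 4, 5],
    "Review Text": ["Not great speakers", None, "Awesome"],
    "Review Useful Count": [None, None, 2.0],
})

# Number of missing values per column.
missing = sample.isna().sum()

# Fraction of missing values per column (between 0.0 and 1.0).
share = sample.isna().mean()

print(missing)
print(share)
```

On the real data, the same two calls on `echo_df` should agree with the non-null counts that `info()` reports.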
In [39]:
echo_df['Review Text']
Out[39]:
0 Not great speakers
1 Great little gagit
2 Awesome 👏🏽
3 Love my Echo
4 Great device
...
6850 This is so much fun! I love her.
6851 I'm having a lot of fun with it.
6852 I bought this as a gift for my husband and he ...
6853 I have now set Alexa up to control lights in m...
6854 What a shame, I tried one my friend has and wa...
Name: Review Text, Length: 6855, dtype: object
In [41]:
# str.lower()
echo_df['Review Text'].str.lower()
Out[41]:
0 not great speakers
1 great little gagit
2 awesome 👏🏽
3 love my echo
4 great device
...
6850 this is so much fun! i love her.
6851 i'm having a lot of fun with it.
6852 i bought this as a gift for my husband and he ...
6853 i have now set alexa up to control lights in m...
6854 what a shame, i tried one my friend has and wa...
Name: Review Text, Length: 6855, dtype: object
In [42]:
# str.upper()
echo_df['Review Text'].str.upper()
Out[42]:
0 NOT GREAT SPEAKERS
1 GREAT LITTLE GAGIT
2 AWESOME 👏🏽
3 LOVE MY ECHO
4 GREAT DEVICE
...
6850 THIS IS SO MUCH FUN! I LOVE HER.
6851 I'M HAVING A LOT OF FUN WITH IT.
6852 I BOUGHT THIS AS A GIFT FOR MY HUSBAND AND HE ...
6853 I HAVE NOW SET ALEXA UP TO CONTROL LIGHTS IN M...
6854 WHAT A SHAME, I TRIED ONE MY FRIEND HAS AND WA...
Name: Review Text, Length: 6855, dtype: object
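Note that `str.lower()` and `str.upper()` return a new Series and leave the original column untouched, and that missing values stay missing instead of raising an error. A minimal sketch of both points, using a made-up three-row `sample` frame and a hypothetical new column name `review_text_lower`:

```python
import pandas as pd

# Hypothetical stand-in for echo_df; the last review is missing on purpose.
sample = pd.DataFrame(
    {"Review Text": ["Not great speakers", "Great little gagit", None]}
)

# .str.lower() returns a new Series; the original column is unchanged.
lowered = sample["Review Text"].str.lower()

# To keep the result, assign it to a column explicitly.
sample["review_text_lower"] = lowered

print(sample)
```

The missing review propagates through the `.str` accessor as a missing value, so no separate null check is needed before changing case.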
In [44]:
# convert the column header names to lowercase
echo_df.columns = echo_df.columns.str.lower()
echo_df.head()
Out[44]:
 | rating | review date | configuration text | review text | review color | title | user verified | review useful count | declaration text | pageurl |
---|---|---|---|---|---|---|---|---|---|---|
0 | 3 | 10/3/2017 | Echo Dot | Not great speakers | Black | Three Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
1 | 4 | 9/26/2017 | Echo Dot | Great little gagit | White | Four Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
2 | 5 | 9/8/2017 | Echo Dot | Awesome 👏🏽 | White | Awesome! | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
3 | 5 | 10/19/2017 | Echo Dot | Love my Echo | Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
4 | 5 | 9/17/2017 | Echo Dot | Great device | Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
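The lowercased headers still contain spaces, so they can only be reached with bracket syntax such as `echo_df['review text']`. One optional follow-up, shown here only as a sketch on an empty stand-in frame, is to chain `str.replace` so the names become valid attribute names. This step is not part of the original notebook.

```python
import pandas as pd

# Empty stand-in frame with a few of the original header names.
sample = pd.DataFrame(columns=["Rating", "Review Date", "Review Text"])

# Lowercase the headers, then swap spaces for underscores.
# regex=False makes the replacement a plain substring match.
sample.columns = (
    sample.columns.str.lower().str.replace(" ", "_", regex=False)
)

print(list(sample.columns))
```

The later cells in this post keep the spaced names, so applying this rename would require updating those column lookups as well.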
In [46]:
echo_df['review text'].str.title()
Out[46]:
0 Not Great Speakers
1 Great Little Gagit
2 Awesome 👏🏽
3 Love My Echo
4 Great Device
...
6850 This Is So Much Fun! I Love Her.
6851 I'M Having A Lot Of Fun With It.
6852 I Bought This As A Gift For My Husband And He ...
6853 I Have Now Set Alexa Up To Control Lights In M...
6854 What A Shame, I Tried One My Friend Has And Wa...
Name: review text, Length: 6855, dtype: object
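One detail in the output above: `str.title()` treats the apostrophe as a word boundary, which is why row 6851 becomes "I'M Having ...". If only the first letter of each review should be uppercase, `str.capitalize()` is one alternative. The sketch below uses a single review copied from the output above in a stand-alone Series.

```python
import pandas as pd

# One review taken from the output above, in a stand-alone Series.
reviews = pd.Series(["I'm having a lot of fun with it."])

# title(): uppercases the first letter of every word,
# including the letter right after an apostrophe.
titled = reviews.str.title()

# capitalize(): uppercases only the first character and
# lowercases the rest of the string.
capitalized = reviews.str.capitalize()

print(titled[0])
print(capitalized[0])
```

Keep in mind that `capitalize()` also lowercases everything after the first character, so names such as "Alexa" would lose their capital letter.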