In [26]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
In [27]:
echo_df = pd.read_csv('Echodot2_Reviews.csv', encoding='utf-8')
echo_df.head()
Out[27]:
| | Rating | Review Date | Configuration Text | Review Text | Review Color | Title | User Verified | Review Useful Count | Declaration Text | Pageurl |
---|---|---|---|---|---|---|---|---|---|---|
0 | 3 | 10/3/2017 | Echo Dot | Not great speakers | Black | Three Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
1 | 4 | 9/26/2017 | Echo Dot | Great little gagit | White | Four Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
2 | 5 | 9/8/2017 | Echo Dot | Awesome 👏🏽 | White | Awesome! | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
3 | 5 | 10/19/2017 | Echo Dot | Love my Echo | Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
4 | 5 | 9/17/2017 | Echo Dot | Great device | Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
In [28]:
# replace the substring 'Black' with 'Dark Black' in every value of the column
echo_df['Review Color'] = echo_df['Review Color'].str.replace('Black', 'Dark Black', regex=False)
echo_df.head()
Out[28]:
| | Rating | Review Date | Configuration Text | Review Text | Review Color | Title | User Verified | Review Useful Count | Declaration Text | Pageurl |
---|---|---|---|---|---|---|---|---|---|---|
0 | 3 | 10/3/2017 | Echo Dot | Not great speakers | Dark Black | Three Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
1 | 4 | 9/26/2017 | Echo Dot | Great little gagit | White | Four Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
2 | 5 | 9/8/2017 | Echo Dot | Awesome 👏🏽 | White | Awesome! | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
3 | 5 | 10/19/2017 | Echo Dot | Love my Echo | Dark Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
4 | 5 | 9/17/2017 | Echo Dot | Great device | Dark Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
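Because `.str.replace` works on substrings, re-running the cell above would rewrite the `Black` inside `Dark Black` as well. The sketch below uses a small made-up Series (`colors` is a hypothetical stand-in for `echo_df['Review Color']`) to contrast it with `Series.replace`, which only swaps values that match exactly.

```python
import pandas as pd

# Hypothetical stand-in for echo_df['Review Color']
colors = pd.Series(['Black', 'White', 'Black'])

# Substring replacement: every occurrence of 'Black' inside each string is rewritten.
once = colors.str.replace('Black', 'Dark Black', regex=False)
# Running the same step again also rewrites the 'Black' inside 'Dark Black'.
twice = once.str.replace('Black', 'Dark Black', regex=False)

# Whole-value replacement: only cells exactly equal to 'Black' change,
# so a second run leaves 'Dark Black' untouched.
exact_once = colors.replace('Black', 'Dark Black')
exact_twice = exact_once.replace('Black', 'Dark Black')

print(twice.tolist())        # ['Dark Dark Black', 'White', 'Dark Dark Black']
print(exact_twice.tolist())  # ['Dark Black', 'White', 'Dark Black']
```

If the goal is simply to rename one category, the whole-value form is probably the safer choice in a notebook where cells get re-run.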
In [29]:
echo_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6855 entries, 0 to 6854
Data columns (total 10 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Rating 6855 non-null int64
1 Review Date 6855 non-null object
2 Configuration Text 6855 non-null object
3 Review Text 6852 non-null object
4 Review Color 6855 non-null object
5 Title 6855 non-null object
6 User Verified 6641 non-null object
7 Review Useful Count 28 non-null float64
8 Declaration Text 6 non-null object
9 Pageurl 6855 non-null object
dtypes: float64(1), int64(1), object(8)
memory usage: 535.7+ KB
In [30]:
# drop the rows whose 'Review Text' is missing (6852 of 6855 rows are non-null)
echo_df.dropna(subset=['Review Text'], how='any', axis=0, inplace=True)
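As a minimal sketch of what `subset` does here, the toy frame below (`reviews` is made up for illustration, not the real data) has missing values in two columns. Only the column named in `subset` decides which rows are dropped, and the surviving rows keep their original index labels.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the review table
reviews = pd.DataFrame({
    'Rating': [5, 4, 3],
    'Review Text': ['Love it', np.nan, 'Not great speakers'],
    'Review Useful Count': [np.nan, np.nan, 2.0],
})

# Only 'Review Text' is checked; NaN in other columns does not drop a row.
cleaned = reviews.dropna(subset=['Review Text'])

print(len(reviews), len(cleaned))  # 3 2
print(cleaned.index.tolist())      # [0, 2]
```

The gap in the index is expected: `dropna` does not renumber rows, so `reset_index(drop=True)` would be needed if consecutive labels matter.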
In [31]:
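# ends with 'love' (case-insensitive, because the text is lowercased first)
# method chaining: instead of writing each step separately, all the calls can be chained in a single expression.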
mask = echo_df['Review Text'].str.lower().str.endswith('love')
echo_df[mask]
Out[31]:
| | Rating | Review Date | Configuration Text | Review Text | Review Color | Title | User Verified | Review Useful Count | Declaration Text | Pageurl |
---|---|---|---|---|---|---|---|---|---|---|
3103 | 5 | 9/28/2017 | Echo Dot | LOVE | Dark Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
3160 | 5 | 9/13/2017 | Echo Dot | What's not to love | Dark Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
6757 | 5 | 9/30/2017 | Echo Dot | Love love love | White | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
In [32]:
# starts with 'love' (case-insensitive prefix match, so 'Loved' also matches)
mask = echo_df['Review Text'].str.lower().str.startswith('love')
echo_df[mask]
Out[32]:
| | Rating | Review Date | Configuration Text | Review Text | Review Color | Title | User Verified | Review Useful Count | Declaration Text | Pageurl |
---|---|---|---|---|---|---|---|---|---|---|
3 | 5 | 10/19/2017 | Echo Dot | Love my Echo | Dark Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
20 | 5 | 9/8/2017 | Echo Dot | Love the echo! | Dark Black | This is the second one for the house. | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
30 | 5 | 10/8/2017 | Echo Dot | Loved | Dark Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
31 | 5 | 10/22/2017 | Echo Dot | Love the product! nice alexa! | White | Very nice item! | NaN | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
38 | 5 | 9/1/2017 | Echo Dot | Love this little gadget so much I got my mom o... | Dark Black | Keeping Tabs on my mom! | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
6821 | 5 | 10/8/2017 | Echo Dot | Love it | Dark Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
6837 | 5 | 9/11/2017 | Echo Dot | Love having 2 | White | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
6838 | 5 | 9/7/2017 | Echo Dot | Love love love love it. Cut my cable bill by a... | Dark Black | LOVE MY ECHO DOT!! | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
6844 | 5 | 9/1/2017 | Echo Dot | Love this little gadget so much I got my mom o... | Dark Black | Keeping Tabs on my mom! | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
6849 | 5 | 9/14/2017 | Echo Dot | Love it. | Dark Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
721 rows × 10 columns
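The masks built above are ordinary boolean Series, so they can be counted and combined without displaying the whole frame. The following is a small sketch on invented data (`reviews` and its rows are hypothetical) showing `sum()` for counting and `&` / `|` for combining conditions.

```python
import pandas as pd

# Hypothetical miniature of the review table
reviews = pd.DataFrame({
    'Rating': [5, 5, 3, 5],
    'Review Text': ['Love it', 'What is not to love',
                    'Not great speakers', 'Love love love'],
})

lowered = reviews['Review Text'].str.lower()
starts = lowered.str.startswith('love')
ends = lowered.str.endswith('love')

# True counts as 1, so summing a mask gives the number of matching rows.
n_starts = int(starts.sum())

# Element-wise AND / OR of two masks.
both = reviews[starts & ends]
either = reviews[starts | ends]

print(n_starts)               # 2
print(both.index.tolist())    # [3]
print(either.index.tolist())  # [0, 1, 3]
```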
In [36]:
# contains 'love' anywhere in the text (case-insensitive substring match)
mask = echo_df['Review Text'].str.lower().str.contains('love')
echo_df[mask]
Out[36]:
| | Rating | Review Date | Configuration Text | Review Text | Review Color | Title | User Verified | Review Useful Count | Declaration Text | Pageurl |
---|---|---|---|---|---|---|---|---|---|---|
3 | 5 | 10/19/2017 | Echo Dot | Love my Echo | Dark Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
11 | 5 | 10/9/2017 | Echo Dot | Alexa...You rock!! OMG people. I am not a tech... | Dark Black | This is the greatest thing since chocolate | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
13 | 5 | 9/20/2017 | Echo Dot | I love using Alexa with the smart outlets for ... | White | Love it! | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
17 | 5 | 10/8/2017 | Echo Dot | Cant say enough !!I love my DOT!!!! | Dark Black | Happy with her Dot | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
20 | 5 | 9/8/2017 | Echo Dot | Love the echo! | Dark Black | This is the second one for the house. | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
6845 | 5 | 9/13/2017 | Echo Dot | We now have 4 Dots & one Show in the house. Pe... | White | Perfect for everyone in our family | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
6846 | 5 | 9/4/2017 | Echo Dot | Alexa is exceptional, I am getting use to ever... | Dark Black | From what I know so far I love it. | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
6849 | 5 | 9/14/2017 | Echo Dot | Love it. | Dark Black | Five Stars | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
6850 | 5 | 9/17/2017 | Echo Dot | This is so much fun! I love her. | Dark Black | In love with Alexa!! | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
6853 | 5 | 9/27/2017 | Echo Dot | I have now set Alexa up to control lights in m... | Dark Black | Simply fabulous! | Verified Purchase | NaN | NaN | https://www.amazon.com/All-New-Amazon-Echo-Dot... |
1963 rows × 10 columns
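Note that `str.contains('love')` is a substring test, so it would also match text such as "loved" or "gloves". If the intent is to match "love" as a standalone word, one possible approach is a regular expression with word boundaries. The sample strings below are made up for illustration.

```python
import pandas as pd

# Hypothetical sample reviews
texts = pd.Series(['LOVE', 'Loved', 'I love her.', 'Nice gloves', 'Great device'])

# Plain substring test, as in the cell above (lowercase first, then search).
substring = texts.str.lower().str.contains('love', regex=False)

# Whole-word test: \b marks a word boundary; case=False replaces the .str.lower() step.
whole_word = texts.str.contains(r'\blove\b', case=False, regex=True)

print(substring.tolist())   # [True, True, True, True, False]
print(whole_word.tolist())  # [True, False, True, False, False]
```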
In [33]:
# split each review on whitespace into a list of words
echo_df['Review Text'].str.split()
Out[33]:
0 [Not, great, speakers]
1 [Great, little, gagit]
2 [Awesome, 👏🏽]
3 [Love, my, Echo]
4 [Great, device]
...
6850 [This, is, so, much, fun!, I, love, her.]
6851 [I'm, having, a, lot, of, fun, with, it.]
6852 [I, bought, this, as, a, gift, for, my, husban...
6853 [I, have, now, set, Alexa, up, to, control, li...
6854 [What, a, shame,, I, tried, one, my, friend, h...
Name: Review Text, Length: 6852, dtype: object
In [34]:
# select an element by position from each split list
# here, the first element (position 0)
echo_df['Review Text'].str.split(' ').str.get(0)
Out[34]:
0 Not
1 Great
2 Awesome
3 Love
4 Great
...
6850 This
6851 I'm
6852 I
6853 I
6854 What
Name: Review Text, Length: 6852, dtype: object
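The two cells above call `split()` and `split(' ')`, which behave differently when a review contains consecutive spaces. The sketch below, on invented strings, shows that difference and also what `.str.get()` returns when a list is too short for the requested position.

```python
import pandas as pd

# Hypothetical sample reviews; the first one contains a double space.
texts = pd.Series(['Not great  speakers', 'Love my Echo', 'LOVE'])

# No argument: split on any run of whitespace, so no empty strings appear.
by_whitespace = texts.str.split()
# Explicit ' ': split on every single space, so a double space yields ''.
by_single_space = texts.str.split(' ')

# .str[0] and .str.get(0) are equivalent ways to pick the first element.
first = by_whitespace.str[0]
# Out-of-range positions give NaN instead of raising an error.
second = by_whitespace.str.get(1)

print(by_whitespace.iloc[0])    # ['Not', 'great', 'speakers']
print(by_single_space.iloc[0])  # ['Not', 'great', '', 'speakers']
print(first.tolist())           # ['Not', 'Love', 'LOVE']
print(second.tolist())          # ['great', 'my', nan]
```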