I have a simple question about creating the equivalent of SQL's 'case when' in Python. I have a list of Twitter statements from which I need to extract the Windows version and categorize it in a new column.
I wrote the function below, but it does not work and fails with the error message: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
    def windows_ver(x):
        if df['twitter'].str.lower().str.contains('windows 11'):
            return 'windows 11'
        elif df['twitter'].str.lower().str.contains('windows 10'):
            return 'windows 10'
        return 'windows 8 or older'

    df['windows_version'] = df['twitter'].apply(windows_ver)
Could you please advise how I can correctly create a new column 'windows_version' that categorizes tweets by the system they mention? Thank you in advance for your help!
Answer
Consider the pandas.Series.case_when method, added in pandas 2.2.
    df['windows_version'] = df['twitter'].case_when(
        [
            (df['twitter'].str.lower().str.contains('windows 11'), 'windows 11'),
            (df['twitter'].str.lower().str.contains('windows 10'), 'windows 10'),
            (pd.Series(True), 'windows 8 or older')
        ]
    )
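If case_when is not available in your pandas version, the following is a minimal sketch of two alternatives, not a definitive implementation. The original error arises because the `if` statement is given a whole boolean Series (one value per row) rather than a single True/False. Since Series.apply passes one element at a time, the function can test its own argument instead of df['twitter']. The second variant uses numpy.select, which behaves much like SQL's CASE WHEN. The sample tweets, the parameter name `text`, and the column `windows_version_np` are assumptions made up for illustration; only the 'twitter' and 'windows_version' names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name 'twitter' comes from the question.
df = pd.DataFrame({
    "twitter": [
        "Just upgraded to Windows 11!",
        "Still happy on WINDOWS 10",
        "My old laptop runs Windows 7",
    ]
})


def windows_ver(text):
    """Classify a single tweet; `text` is one string, not the whole column."""
    text = text.lower()
    if "windows 11" in text:
        return "windows 11"
    elif "windows 10" in text:
        return "windows 10"
    return "windows 8 or older"


# Row-wise variant: apply() calls windows_ver once per tweet.
df["windows_version"] = df["twitter"].apply(windows_ver)

# Vectorized variant: the first matching condition wins, as in CASE WHEN.
lowered = df["twitter"].str.lower()
conditions = [
    lowered.str.contains("windows 11", regex=False, na=False),
    lowered.str.contains("windows 10", regex=False, na=False),
]
choices = ["windows 11", "windows 10"]
df["windows_version_np"] = np.select(
    conditions, choices, default="windows 8 or older"
)
```

Both variants should produce the same labels. The numpy.select form is usually the better fit for large frames because it avoids calling a Python function for every row.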