Rename Multiple Columns Of Pandas Dataframe Based On Condition
Solution 1:
If you want to stick with rename:
def renaming_fun(x):
    if "NULL" in x or "UNNAMED" in x:
        return ""  # or None
    return x

df = df.rename(columns=renaming_fun)
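To try the rename approach end to end, here is a minimal sketch. The sample DataFrame and its column names are made up for illustration (they only mimic the NULL/UNNAMED pattern from the question); renaming_fun is the function from the answer.

```python
import pandas as pd

# Hypothetical sample data; the column names only mimic the question's pattern.
df = pd.DataFrame(
    [[1, 11, 21, 31], [2, 22, 32, 42]],
    columns=["C1", "NULL1", "C2", "UNNAMED1"],
)

def renaming_fun(x):
    # Blank out any name containing "NULL" or "UNNAMED"; keep everything else.
    if "NULL" in x or "UNNAMED" in x:
        return ""
    return x

# rename() returns a new DataFrame; the original df is left untouched.
renamed = df.rename(columns=renaming_fun)
print(list(renamed.columns))  # ['C1', '', 'C2', '']
```

Because rename calls the function once per column label, any extra conditions can go inside renaming_fun without touching the rename call itself.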
It can be handy if the mapping function gets more complex. Otherwise, list comprehensions will do:
df.columns = [renaming_fun(col) for col in df.columns]
Another possibility:
df.columns = list(map(renaming_fun, df.columns))
But as it was already mentioned, renaming with empty strings is not something you would usually do.
Solution 2:
It is possible, but be careful: the column names are then duplicated, so selecting one empty-named column returns all of them:
print (df[''])
0 1 11 41 51
1 2 22 42 52
2 3 33 43 53
3 4 44 44 54
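The duplicate-name pitfall can be reproduced with a small sketch like the one below. The data is invented for illustration; the positional lookup with iloc is just one way to still reach a single column, not something the original answer prescribes.

```python
import pandas as pd

# Hypothetical frame in which two columns already share the empty name.
df = pd.DataFrame(
    [[1, 11, 21, 31], [2, 22, 32, 42]],
    columns=["C1", "", "C2", ""],
)

# Selecting by the duplicated label returns every matching column,
# so the result is a DataFrame rather than a Series.
selected = df[""]
print(type(selected).__name__)  # DataFrame
print(selected.shape)           # (2, 2)

# Positional access still picks exactly one column.
second_col = df.iloc[:, 1]
print(second_col.tolist())      # [11, 22]
```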
Use startswith with a tuple of prefixes in a list comprehension to match all of the columns at once:
df.columns = ['' if c.startswith(('NULL','UNNAMED')) else c for c in df.columns]
Your solution should be changed:
d = dict.fromkeys(df.columns[df.columns.str.startswith(('NULL','UNNAMED'))], '')
print (d)
{'NULL1': '', 'NULL2': '', 'UNNAMED1': '', 'UNNAMED2': ''}
df = df.rename(columns=d)
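A self-contained version of the dict.fromkeys idea might look like this. The sample columns are assumptions, and the sketch uses Python's built-in str.startswith in a comprehension instead of the .str accessor, since, as far as I know, passing a tuple to df.columns.str.startswith needs a fairly recent pandas.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    [[1, 11, 21, 31, 41]],
    columns=["C1", "NULL1", "NULL2", "UNNAMED1", "C2"],
)

prefixes = ("NULL", "UNNAMED")

# Map every column whose name starts with one of the prefixes to ''.
d = dict.fromkeys([c for c in df.columns if c.startswith(prefixes)], "")
print(d)  # {'NULL1': '', 'NULL2': '', 'UNNAMED1': ''}

df = df.rename(columns=d)
print(list(df.columns))  # ['C1', '', '', '', 'C2']
```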
Solution 3:
You can use dict comprehension inside df.rename():
import numpy as np
idx_filter = np.asarray([i for i, col in enumerate(df.columns) if SOME_STRING_CONDITION in col])
df.rename(columns={col: '' for col in df.columns[idx_filter]}, inplace=True)
In your case, it sounds like SOME_STRING_CONDITION would be 'NULL' or 'UNNAMED'.
I figured this out while looking for help on a thread about a more generic column renaming issue (Renaming columns in pandas) for a problem of my own. I didn't have enough reputation to add my solution there as an answer or comment (I'm new-ish on Stack Overflow), so I am posting it here!
This solution is also helpful if you need to keep part of the string that you were filtering for. For example, if you wanted to replace the "C" columns with "col_":
idx_filter = np.asarray([i for i, col in enumerate(df.columns) if 'C' in col])
df.rename(columns={col: col.replace('C', 'col_') for col in df.columns[idx_filter]}, inplace=True)
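Here is a runnable sketch of the same idea on made-up data. The dtype=int argument is my addition so that an empty match list still produces a valid integer index array. The second variant, passing a callable to rename, is an alternative I am suggesting rather than part of the answer above; it gives the same result here because replace leaves names without a "C" unchanged.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame([[1, 2, 3]], columns=["C1", "C2", "NULL1"])

# Positions of the columns whose name contains "C".
idx_filter = np.asarray(
    [i for i, col in enumerate(df.columns) if "C" in col], dtype=int
)

# Build the old-name -> new-name mapping only for the matching columns.
mapping = {col: col.replace("C", "col_") for col in df.columns[idx_filter]}
via_mapping = df.rename(columns=mapping)
print(list(via_mapping.columns))  # ['col_1', 'col_2', 'NULL1']

# Alternative: let rename apply the replacement to every column name.
via_callable = df.rename(columns=lambda col: col.replace("C", "col_"))
print(list(via_callable.columns))  # ['col_1', 'col_2', 'NULL1']
```

Note that col.replace("C", "col_") replaces every "C" in a name, not just a leading one, so names with several "C" characters would need a stricter rule.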
Solution 4:
If only a few columns should retain their names, use a list comprehension as below:
df.columns = [col if col in ('C1','C2') else "" for col in df.columns]
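As a quick check of this keep-list approach, here is a small sketch. The sample columns and the keep set are hypothetical; a set is used only because membership tests read naturally with it.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["C1", "NULL1", "C2", "UNNAMED1"])

# Names to retain; every other column gets an empty name.
keep = {"C1", "C2"}
df.columns = [col if col in keep else "" for col in df.columns]
print(list(df.columns))  # ['C1', '', 'C2', '']
```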
Solution 5:
df.columns = [col if "NULL" not in col else "" for col in df.columns]
This should work, since you can change the column names by assigning a list of the same length to the DataFrame's columns attribute.
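One thing to keep in mind with direct assignment is that the list must have exactly one entry per column. The sketch below, on invented data, shows the assignment working and then pandas rejecting a list of the wrong length with a ValueError.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame([[1, 2, 3]], columns=["C1", "NULL1", "C2"])

# One new name per existing column, so the assignment succeeds.
df.columns = [col if "NULL" not in col else "" for col in df.columns]
print(list(df.columns))  # ['C1', '', 'C2']

# A list of the wrong length is rejected and the names stay as they were.
length_error = None
try:
    df.columns = ["only_one_name"]
except ValueError as exc:
    length_error = exc
    print("rejected:", type(exc).__name__)
```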