
Rename Multiple Columns Of Pandas Dataframe Based On Condition

I have a df in which I need to rename 40 columns to an empty string. This can be achieved by using .rename(), but I need to provide all the column names in a dict, which needs to b

Solution 1:

If you want to stick with rename:

def renaming_fun(x):
    if "NULL" in x or "UNNAMED" in x:
        return "" # or None
    return x

df = df.rename(columns=renaming_fun)

It can be handy if the mapping function gets more complex. Otherwise, list comprehensions will do:

df.columns = [renaming_fun(col) for col in df.columns]

Another possibility:

df.columns = list(map(renaming_fun, df.columns))

But as it was already mentioned, renaming with empty strings is not something you would usually do.
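For what it's worth, here is a minimal sketch of the rename-with-a-function approach on a small frame. The sample DataFrame and its column names are made up for illustration; only renaming_fun comes from the answer above.

```python
import pandas as pd

# Hypothetical sample frame; the column names are invented for this example.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["C1", "NULL1", "UNNAMED1", "C2"])

def renaming_fun(x):
    # Blank out any column whose name contains one of the markers.
    if "NULL" in x or "UNNAMED" in x:
        return ""
    return x

# rename() calls the function once per column label and returns a new frame.
renamed = df.rename(columns=renaming_fun)
print(list(renamed.columns))
```

Note that rename returns a new DataFrame here, so df itself keeps its original column names unless you reassign it.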


Solution 2:

It is possible, but be careful: the column names are then duplicated, so selecting one empty-named column returns all of the empty-named columns:

print (df[''])

0  1  11  41  51
1  2  22  42  52
2  3  33  43  53
3  4  44  44  54
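The duplicate-name behaviour can be checked with a small sketch like the one below. The frame is a made-up example, not the asker's data.

```python
import pandas as pd

# Hypothetical frame with two columns that share the empty-string name.
df = pd.DataFrame([[1, 11, 41], [2, 22, 42]], columns=["C1", "", ""])

# Selecting a duplicated label returns every matching column as a DataFrame,
# not a single Series.
selected = df[""]
print(type(selected).__name__)
print(selected.shape)
print(df.columns.is_unique)
```

Checking df.columns.is_unique before relying on label-based selection is one way to catch this early.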

Use startswith with a tuple of prefixes in a list comprehension to match all of these columns:

df.columns = ['' if c.startswith(('NULL','UNNAMED')) else c for c in df.columns]

Your rename solution should be changed so that the dict is built from the matching columns:

d = dict.fromkeys(df.columns[df.columns.str.startswith(('NULL','UNNAMED'))], '')
print (d)
{'NULL1': '', 'NULL2': '', 'UNNAMED1': '', 'UNNAMED2': ''}
df = df.rename(columns=d)
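If your pandas version's .str.startswith does not accept a tuple (I believe tuple support arrived around pandas 1.5, but treat that as an assumption), the same dict can be built with plain Python string methods. A rough sketch, with an invented sample frame and a hypothetical prefixes variable:

```python
import pandas as pd

# Hypothetical sample frame mirroring the column names printed above.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5]],
    columns=["C1", "NULL1", "NULL2", "UNNAMED1", "UNNAMED2"],
)

prefixes = ("NULL", "UNNAMED")  # assumed prefixes, taken from the question

# Python's built-in str.startswith accepts a tuple of prefixes.
d = {c: "" for c in df.columns if c.startswith(prefixes)}
print(d)

df = df.rename(columns=d)
print(list(df.columns))
```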

Solution 3:

You can use dict comprehension inside df.rename():

import numpy as np

# dtype=int keeps the index array usable even when no column matches
idx_filter = np.asarray([i for i, col in enumerate(df.columns) if SOME_STRING_CONDITION in col], dtype=int)
df.rename(columns={col: '' for col in df.columns[idx_filter]}, inplace=True)

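In your case, it sounds like SOME_STRING_CONDITION would be 'NULL' or 'UNNAMED'. Each substring needs its own `in` test, because writing 'NULL' or 'UNNAMED' in col evaluates to just 'NULL', which is always truthy.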
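One way to test several substrings at once is any() inside the dict comprehension. This is only a sketch; the sample frame and the substrings tuple are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample frame; the names are invented for this example.
df = pd.DataFrame([[1, 2, 3]], columns=["C1", "X_NULL", "UNNAMED_3"])

substrings = ("NULL", "UNNAMED")  # assumed markers, taken from the question

# Map every column containing at least one marker to the empty string.
mapping = {col: "" for col in df.columns if any(s in col for s in substrings)}

df = df.rename(columns=mapping)
print(list(df.columns))
```

This also avoids the NumPy index array entirely, since the comprehension can iterate over df.columns directly.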

I figured this out while looking for help on a thread for a more generic column renaming issue (Renaming columns in pandas) for a problem of my own. I didn't have enough reputation to add my solution as an answer or comment (I'm new-ish on stackoverflow), so I am posting it here!

This solution is also helpful if you need to keep part of the string that you were filtering for. For example, if you wanted to replace the "C" in the matching column names with "col_":

idx_filter = np.asarray([i for i, col in enumerate(df.columns) if 'C' in col])
df.rename(columns={col: col.replace('C', 'col_') for col in df.columns[idx_filter]}, inplace=True)
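For this particular case, a shorter variant might be to pass a function straight to rename, because str.replace leaves names without a "C" untouched and no pre-filtering is needed. A sketch on a made-up frame:

```python
import pandas as pd

# Hypothetical sample frame; only the first two names contain "C".
df = pd.DataFrame([[1, 2, 3]], columns=["C1", "C2", "NULL1"])

# replace() is a no-op for labels that do not contain "C".
df = df.rename(columns=lambda col: col.replace("C", "col_"))
print(list(df.columns))
```

Keep in mind that replace substitutes every "C" in a name, so a label with several capital C's would be changed in more than one place.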

Solution 4:

If only a few columns need to retain their names, use a list comprehension as below:

df.columns = [col if col in ('C1','C2') else "" for col in df.columns]
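A quick sketch of this keep-list idea on a throwaway frame. The frame and the keep variable are hypothetical names introduced only for the example.

```python
import pandas as pd

# Hypothetical sample frame; the column names are invented.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["C1", "NULL1", "C2", "UNNAMED1"])

keep = {"C1", "C2"}  # assumed set of names to preserve

# Every label outside the keep set becomes the empty string.
df.columns = [col if col in keep else "" for col in df.columns]
print(list(df.columns))
```

The new list must have exactly as many entries as the frame has columns, which the comprehension over df.columns guarantees.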

Solution 5:

df.columns = [col if "NULL" not in col else "" for col in df.columns]

This should work, since you can change the column names by assigning a list to the DataFrame's columns attribute.

