
Remove -#### In Zipcodes

How do I remove the +4 from ZIP codes in Python? I've got data like

85001
52804-3233
Winston-Salem

and I want that to become

85001
52804
Winston-Salem

Solution 1:

>>> zipcode = '52804-3233'
>>> zipcode[:5]
'52804'

Of course, when you parse the lines from the original data you should add some rule to distinguish ZIP codes that need fixing from other strings. I don't know what your data looks like, so I can't help much, but you could check whether a string consists only of digits and the '-' symbol.
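One way to sketch such a rule, assuming every value is a separate string and that only a full ZIP+4 (5 digits, a dash, 4 digits) should be shortened. The name `strip_plus4` is a hypothetical helper, not something from the question:

```python
import re

# Assumption: a value is a ZIP+4 only if the WHOLE string is ddddd-dddd.
ZIP_PLUS4 = re.compile(r'\d{5}-\d{4}')

def strip_plus4(value):
    """Return the 5-digit part of a ZIP+4, or the value unchanged."""
    # fullmatch succeeds only when the entire string fits the pattern
    if ZIP_PLUS4.fullmatch(value):
        return value[:5]
    return value

values = ['85001', '52804-3233', 'Winston-Salem']
cleaned = [strip_plus4(v) for v in values]
print(cleaned)  # ['85001', '52804', 'Winston-Salem']
```

Plain slicing on its own would turn 'Winston-Salem' into 'Winst', which is why the check comes first.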

Solution 2:

>>> import re
>>> s = "52804-3233"
>>> # regex to remove a dash and the 4 digits after it, following 5 digits:
>>> re.sub(r'(\d{5})-\d{4}', '\\1', s)
'52804'

The \\1 is a so-called backreference: it is replaced by the first group, which is the 5-digit ZIP code in this case.
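As a small illustration of how backreferences behave, here is a sketch that uses a second group purely for demonstration (the second group is an addition, not part of the answer above). It also shows that `'\\1'` and the raw string `r'\1'` are the same replacement text:

```python
import re

s = "52804-3233"

# In a normal string the backslash must be doubled; a raw string avoids that.
same_text = ('\\1' == r'\1')

# \1 refers to the first parenthesized group: the 5-digit ZIP code.
short = re.sub(r'(\d{5})-\d{4}', r'\1', s)

# With two groups, \2 refers to the +4 part; swapping them makes this visible.
swapped = re.sub(r'(\d{5})-(\d{4})', r'\2-\1', s)

print(same_text, short, swapped)  # True 52804 3233-52804
```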

Solution 3:

You could try something like this:

cleaned = []
for value in inputs:
    # Keep only the first 5 characters when they are all digits
    if value[:5].isnumeric():
        value = value[:5]
    cleaned.append(value)

Just keep the first 5 characters of any string whose first 5 characters are all digits.

Solution 4:

re.sub(r'-\d{4}$', '', zipcode)

Solution 5:

This matches every item of the form 00000-0000 that has a space or other word boundary before and after it, and replaces it with the first five digits. The other regexes posted will also match some number formats that you might not want.

re.sub(r'\b(\d{5})-\d{4}\b', '\\1', zipcode)
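To make the difference concrete, here is a rough comparison of the end-anchored pattern from Solution 4 with the word-boundary pattern, run on a few made-up sample strings (the samples other than the ZIP code and city name are assumptions for illustration). Note that the pattern has to be a raw string: in a normal Python string, `'\b'` is the backspace character, not a word boundary.

```python
import re

# In a non-raw string, '\b' is a backspace (0x08), so the regex would never match.
backspace_pitfall = ('\b' == '\x08')

loose = re.compile(r'-\d{4}$')              # Solution 4 style
strict = re.compile(r'\b(\d{5})-\d{4}\b')   # word-boundary style

samples = ['52804-3233', '555-1234', 'Winston-Salem', 'ID 1234567-8901']

loose_out = [loose.sub('', s) for s in samples]
strict_out = [strict.sub(r'\1', s) for s in samples]

print(loose_out)   # ['52804', '555', 'Winston-Salem', 'ID 1234567']
print(strict_out)  # ['52804', '555-1234', 'Winston-Salem', 'ID 1234567-8901']
```

The loose pattern truncates anything that ends in a dash and four digits, while the word-boundary pattern only touches a standalone 5+4 number.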
