
How To Read Text File's Key, Value Pair Using Pandas?

I want to parse a text file that contains the following data. Input.txt: 1=88|11=1438|15=KKK|45=7.7|45=00|21=66|86=a 4=13|4=1388|49=DDD|8=157.73|67=00|45=08|84=b|45=k 6=84|41=18|56=

Solution 1:

You can first call read_csv with a separator that does not occur in the data, e.g. ;, then split twice, with stack between the two splits:

import pandas as pd
import numpy as np
import io

temp=u"""1=88|11=1438|15=KKK|45=7.7|45=00|21=66|86=a
4=13|4=1388|49=DDD|8=157.73|67=00|45=08|84=b|45=k
6=84|41=18|56=TTT|67=1.2|4=21|45=78|07=d
"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep=";", index_col=None, names=['text'])

print (df)
                                                text
0        1=88|11=1438|15=KKK|45=7.7|45=00|21=66|86=a
1  4=13|4=1388|49=DDD|8=157.73|67=00|45=08|84=b|45=k
2           6=84|41=18|56=TTT|67=1.2|4=21|45=78|07=d

s = df.text.str.split('|', expand=True).stack().str.split('=', expand=True)
print (s)
      0       1
0 0   1      88
  1  11    1438
  2  15     KKK
  3  45     7.7
  4  45      00
  5  21      66
  6  86       a
1 0   4      13
  1   4    1388
  2  49     DDD
  3   8  157.73
  4  67      00
  5  45      08
  6  84       b
  7  45       k
2 0   6      84
  1  41      18
  2  56     TTT
  3  67     1.2
  4   4      21
  5  45      78
  6  07       d

dfs = [g.set_index(0).rename_axis(None) for i, g in s.groupby(level=0)]
print (dfs[0])
       1
1     88
11  1438
15   KKK
45   7.7
45    00
21    66
86     a

for i, g in s.groupby(level=0):
    print (g.set_index(0).rename_axis(None))
       1
1     88
11  1438
15   KKK
45   7.7
45    00
21    66
86     a
         1
4       13
4     1388
49     DDD
8   157.73
67      00
45      08
84       b
45       k
      1
6    84
41   18
56  TTT
67  1.2
4    21
45   78
07    d
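
The steps above can be collected into one helper. This is only a minimal sketch, not the answer's original code: the function name parse_pairs and the column names key and value are assumptions made for illustration. It also swaps split(expand=True).stack() for Series.explode (pandas 0.25 or later), so the result does not depend on how stack treats the padding left for shorter lines.

```python
import io

import pandas as pd


def parse_pairs(source):
    """Return one row per key=value pair, indexed by input line number.

    `source` may be a filename or a file-like object.
    (parse_pairs, 'key' and 'value' are illustrative names.)
    """
    # ';' never occurs in the data, so every line lands in a single column.
    df = pd.read_csv(source, sep=";", header=None, names=["text"], dtype=str)
    # Split each line on '|' into a list, then give every pair its own row.
    pairs = df["text"].str.split("|").explode()
    # Split each pair on the first '=' only, into two columns.
    out = pairs.str.split("=", n=1, expand=True)
    out.columns = ["key", "value"]
    return out


sample = io.StringIO(
    "1=88|11=1438|15=KKK|45=7.7|45=00|21=66|86=a\n"
    "4=13|4=1388|49=DDD|8=157.73|67=00|45=08|84=b|45=k\n"
    "6=84|41=18|56=TTT|67=1.2|4=21|45=78|07=d\n"
)
pairs = parse_pairs(sample)

# One frame per input line, as in the groupby loop above.
frames = [g.set_index("key").rename_axis(None) for _, g in pairs.groupby(level=0)]
print(frames[1])
```

All values stay strings here, so entries such as 00 and 07 keep their leading zeros.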

EDIT by comment:

If you need to write s to a file, use to_csv:

s.to_csv('file.txt', header=None, index=None, sep='\t')  
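
To see what that call writes, here is a small sketch under stated assumptions: it uses a two-row stand-in for s, writes to an in-memory buffer instead of file.txt so that no file is left behind, and spells the options as header=False, index=False, the boolean form documented for these parameters.

```python
import io

import pandas as pd

# Two pairs in the same two-column layout as `s` (illustrative data).
s = pd.DataFrame([["1", "88"], ["11", "1438"]])

buf = io.StringIO()  # stand-in for 'file.txt'
# Dropping the header and the index leaves only the
# tab-separated key and value columns in the output.
s.to_csv(buf, header=False, index=False, sep="\t")
text = buf.getvalue()
print(text)
```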

EDIT1 by comment:

You can set the column name to an empty string and remove the index name with rename_axis (new in pandas 0.18.0), but it is more common to set the column names to some text (e.g. s.columns = ['idx','a']):

s = df.text.str.split('|', expand=True).stack().str.split('=', expand=True)
s.columns = ['idx','']
print (s)
    idx
0 0   1      88
  1  11    1438
  2  15     KKK
  3  45     7.7
  4  45      00
  5  21      66
  6  86       a
1 0   4      13
  1   4    1388
  2  49     DDD
  3   8  157.73
  4  67      00
  5  45      08
  6  84       b
  7  45       k
2 0   6      84
  1  41      18
  2  56     TTT
  3  67     1.2
  4   4      21
  5  45      78
  6  07       d

dfs = [g.set_index('idx').rename_axis(None) for i, g in s.groupby(level=0)]
print (dfs[0])
1     88
11  1438
15   KKK
45   7.7
45    00
21    66
86     a
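
Named columns also make the pairs easier to query. The sketch below is an illustration rather than part of the answer: the names line, key and value are assumptions, and the frame holds only a few pairs taken from the sample data. Because a key can repeat within a line (45 occurs twice in the first line), the values are gathered into a list per line and key.

```python
import pandas as pd

# A few pairs from the sample, indexed by input line number.
s = pd.DataFrame(
    {
        "key": ["1", "45", "45", "4", "4"],
        "value": ["88", "7.7", "00", "13", "1388"],
    },
    index=pd.Index([0, 0, 0, 1, 1], name="line"),
)

# Move the line number into a column, then collect the values
# of every (line, key) combination into a list.
grouped = s.reset_index().groupby(["line", "key"])["value"].agg(list)

print(grouped.loc[(0, "45")])
```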
