Join Multiple Txt Files Into One File Using pd.DataFrame.join()
I have around 20 tab-delimited files like this:
file1.txt
CHROM POS REF ALT BOB
chrA 15 C A 0.01
chrA 18 C A 0.01
chrA 19 C A 0.01
chrA 27 C A 0.0
Solution 1:
from pandas import DataFrame, read_csv, concat
from pathlib import Path
files = Path("files")  # path to the folder containing all .txt files
if not files.exists():
    raise ValueError("Path does not exist")
txt_files = files.glob("*.txt") # getting all txt files
result_df = DataFrame()
for txt_file in txt_files:
    # fields are separated by three spaces (see Notes)
    df = read_csv(txt_file, delimiter="   ", engine='python')
    result_df = concat([result_df, df])
result_df = result_df.fillna('.')
result_df.to_csv('result.txt', sep=" ")
Output:
CHROM POS REF ALT JOHN SMITH
0 chrA 13 C A 0.01 .
1 chrA 16 C A 0.01 .
2 chrA 19 C A 0.01 .
3 chrA 27 C A 0.01 .
0 chrA 15 C A . 0.01
1 chrA 16 C A . 0.01
2 chrA 19 C A . 0.01
3 chrA 36 C A . 0.01
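Note that concat stacks the files on top of each other, so a position present in both files (POS 16 above) shows up twice. If one row per CHROM/POS/REF/ALT is wanted instead, an outer merge on those columns is one possible alternative. The following is only a sketch of that different technique, not the method used in this answer; the in-memory strings stand in for two hypothetical tab-delimited input files.

```python
from functools import reduce
from io import StringIO

import pandas as pd

# Hypothetical in-memory stand-ins for two tab-delimited input files.
john_txt = "CHROM\tPOS\tREF\tALT\tJOHN\nchrA\t13\tC\tA\t0.01\nchrA\t16\tC\tA\t0.01\n"
smith_txt = "CHROM\tPOS\tREF\tALT\tSMITH\nchrA\t15\tC\tA\t0.01\nchrA\t16\tC\tA\t0.01\n"

KEYS = ["CHROM", "POS", "REF", "ALT"]  # columns assumed to be shared by every file

frames = [pd.read_csv(StringIO(text), sep="\t") for text in (john_txt, smith_txt)]

# Outer-merge the frames pairwise so every key combination is kept once.
merged = reduce(lambda left, right: left.merge(right, on=KEYS, how="outer"), frames)
merged = merged.sort_values(KEYS).reset_index(drop=True)

# na_rep writes "." wherever a sample has no value for that position.
out = merged.to_csv(sep="\t", index=False, na_rep=".")
print(out)
```

With real files, the list of frames would be built from the paths returned by glob rather than from strings.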
Notes:
I worked with only 2 .txt files in my example, but it will work with any number of files.
The files given in the example do not have a common delimiter!
Just make sure that every word in your txt file is separated by "   ", i.e. 3 blank spaces.
Basically, the words in the example are not separated by tabs.
Unfortunately, pandas does not support exporting a DataFrame to CSV with a delimiter longer than 1 character!
This is how your txt file should look:
CHROM   POS   REF   ALT   JOHN
chrA   13   C   A   0.01
chrA   16   C   A   0.01
chrA   19   C   A   0.01
chrA   27   C   A   0.01
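The two delimiter notes above can be worked around without editing the input files. The sketch below is an assumption-laden illustration rather than part of the original answer: it reads with a whitespace regex so the exact number of spaces or tabs does not matter, and writes a three-space-separated file by joining the fields by hand, since to_csv only accepts a one-character separator. The sample text and the SEP name are hypothetical.

```python
from io import StringIO

import pandas as pd

# Hypothetical file content with three spaces between fields.
text = (
    "CHROM   POS   REF   ALT   JOHN\n"
    "chrA   13   C   A   0.01\n"
    "chrA   16   C   A   0.01\n"
)

# A regex separator matches any run of whitespace (spaces or tabs).
df = pd.read_csv(StringIO(text), sep=r"\s+")

# Build the output lines manually to get a multi-character delimiter.
SEP = "   "
lines = [SEP.join(df.columns)]
lines += [SEP.join(str(value) for value in row) for row in df.itertuples(index=False)]
output = "\n".join(lines) + "\n"
print(output)
```

This only works when no field itself contains whitespace, which holds for the columns shown in the example.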