
Join Multiple Txt Files Into One File Using pd.DataFrame.join()

I have around 20 tab-delimited files like this:

file1.txt

CHROM POS REF ALT BOB
chrA 15 C A 0.01
chrA 18 C A 0.01
chrA 19 C A 0.01
chrA 27 C A 0.0

Solution 1:

from pandas import DataFrame, read_csv, concat
from pathlib import Path

files = Path("files")  # path to folder containing all .txt files
if not files.exists():
    raise ValueError("Path does not exist")
    
txt_files = files.glob("*.txt") # getting all txt files
   
result_df = DataFrame()
for txt_file in txt_files:
    df = read_csv(txt_file, delimiter="   ", engine='python')
    result_df = concat([result_df, df])
    
result_df = result_df.fillna('.')
result_df.to_csv('result.txt', sep=" ")

Output:

 CHROM POS REF ALT JOHN SMITH
0 chrA 13 C A 0.01 .
1 chrA 16 C A 0.01 .
2 chrA 19 C A 0.01 .
3 chrA 27 C A 0.01 .
0 chrA 15 C A . 0.01
1 chrA 16 C A . 0.01
2 chrA 19 C A . 0.01
3 chrA 36 C A . 0.01
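
The output above stacks the rows of each file, so a position that appears in two files (such as 16) ends up on two separate rows. If one row per CHROM/POS/REF/ALT combination is wanted instead, an outer merge on those key columns is one option. The following is only a minimal sketch, not part of the original solution: the in-memory strings `john_txt` and `smith_txt` are hypothetical stand-ins for two input files, and it assumes every file shares the four key columns and is tab-delimited.

```python
from functools import reduce
from io import StringIO

import pandas as pd

# Hypothetical in-memory stand-ins for two of the input files (assumption).
john_txt = "CHROM\tPOS\tREF\tALT\tJOHN\nchrA\t13\tC\tA\t0.01\nchrA\t16\tC\tA\t0.01\n"
smith_txt = "CHROM\tPOS\tREF\tALT\tSMITH\nchrA\t15\tC\tA\t0.01\nchrA\t16\tC\tA\t0.01\n"

KEYS = ["CHROM", "POS", "REF", "ALT"]  # columns assumed to be shared by every file

frames = [pd.read_csv(StringIO(text), sep="\t") for text in (john_txt, smith_txt)]

# Outer-merge the frames pairwise so every key seen in any file is kept.
merged = reduce(lambda left, right: left.merge(right, on=KEYS, how="outer"), frames)

# Sort by the key columns and mark missing sample values with "."
merged = merged.sort_values(KEYS).reset_index(drop=True).fillna(".")
print(merged.to_csv(sep="\t", index=False))
```

With real files, the list of frames could be built from `Path("files").glob("*.txt")` in the same way as in the solution above.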

Notes:

I worked with only 2 .txt files in my example, but it will work with any number of files. The files given in the example do not have a common delimiter! Just make sure that every word in your txt file is separated by "   ", i.e. 3 blank spaces. Basically, the words given in the example are not separated by tabs. Unfortunately, pandas does not support exporting a DataFrame to csv with a delimiter longer than 1 character!

This is how your txt file should look:

CHROM   POS   REF   ALT   JOHN
chrA   13   C   A   0.01
chrA   16   C   A   0.01
chrA   19   C   A   0.01
chrA   27   C   A   0.01
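
If rewriting every file to use three spaces is inconvenient, it may be possible to read the files as they are. The following is a minimal sketch, assuming no field itself contains whitespace: passing the regex `r"\s+"` as `sep` makes `read_csv` treat any run of tabs or spaces as one delimiter. The `sample` string is a hypothetical stand-in for a file with mixed delimiters.

```python
from io import StringIO

import pandas as pd

# Hypothetical file content mixing tabs and runs of spaces between fields.
sample = "CHROM\tPOS   REF ALT\tJOHN\nchrA   13\tC   A   0.01\nchrA\t16 C\tA 0.01\n"

# r"\s+" treats any run of whitespace as a single delimiter.
df = pd.read_csv(StringIO(sample), sep=r"\s+")

# A single tab is a valid one-character separator for to_csv.
out = df.to_csv(sep="\t", index=False)
print(out)
```

Writing the result with a single tab avoids the one-character limit on the `to_csv` separator mentioned in the notes.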
