Dataset Selective Picking And Transformation
I have a dataset in .xlsx with hundreds of thousands of rows as follows: slug    symbol  name    date    ranknow open    high    low close   volume  market  close_ratio spread compa
Solution 1:
In Python, using pandas, this should work:
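import pandas as pd
# Load the workbook and keep only the columns needed for the reshape
df = pd.read_excel("/path/to/file/Book1.xlsx")
df = df.loc[:, ['symbol', 'name', 'date', 'close']]
# Move symbol and name into the column header, leaving date as the row index
df = df.set_index(['symbol', 'name', 'date'])
df = df.unstack(level=[0, 1])
# Drop the outer 'close' level so the columns are (symbol, name) pairs
df = df['close']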
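The reshape above can be tried without the spreadsheet. The following is a minimal sketch on a small in-memory frame; the sample rows, the values, and the variable names `raw` and `wide` are made up for illustration and are not taken from the original dataset.

```python
import pandas as pd

# Hypothetical sample rows standing in for the spreadsheet contents.
raw = pd.DataFrame({
    "symbol": ["LA", "LA", "YC", "YC"],
    "name": ["Lancer", "Lancer", "Yocomin", "Yocomin"],
    "date": pd.to_datetime(
        ["2018-09-01", "2018-10-01", "2018-09-01", "2018-10-01"]
    ),
    "close": [0.42, 0.49, 1.10, 1.25],
    "volume": [100, 120, 80, 90],
})

# Same steps as the answer: keep four columns, index by
# (symbol, name, date), then move symbol and name into the columns.
wide = (
    raw.loc[:, ["symbol", "name", "date", "close"]]
       .set_index(["symbol", "name", "date"])
       .unstack(level=[0, 1])["close"]
)

print(wide)
```

Note that `unstack` raises a `ValueError` if the same (symbol, name, date) combination appears more than once, so duplicate rows would need to be dropped or aggregated first.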
To read the symbols file and then keep only the symbols that are present in the DataFrame:
symbols = pd.read_csv('/path/to/file/symbols.txt', sep=" ", header=None)
symbols = symbols[0].tolist()
symbols = pd.Index(symbols).unique()
symbols = symbols.intersection(df.columns.get_level_values(0))
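As an alternative to intersecting the labels, the same selection can be expressed as a boolean mask over the first column level. This is only a sketch: the frame `wide`, the list `wanted`, and the values are hypothetical, and the mask approach is a swapped-in variant rather than the method used in the answer.

```python
import pandas as pd

# Hypothetical wide frame with (symbol, name) column levels,
# shaped like the result of the reshape step.
columns = pd.MultiIndex.from_tuples(
    [("AAA", "companyA"), ("LA", "Lancer"), ("YC", "Yocomin")],
    names=["symbol", "name"],
)
wide = pd.DataFrame(
    [[1.0, 2.0, 3.0]],
    index=pd.to_datetime(["2018-09-01"]),
    columns=columns,
)

# Hypothetical wanted list: "ZZZ" is not in the data, "LA" is repeated.
wanted = ["LA", "ZZZ", "AAA", "LA"]

# True for every column whose symbol appears in the wanted list.
mask = wide.columns.get_level_values("symbol").isin(wanted)
subset = wide.loc[:, mask]

print(subset)
```

The mask keeps the columns in the order they already have in the frame, and symbols missing from the data are simply ignored instead of raising a `KeyError`.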
With the symbols filtered, printing the selection from df gives output like:
print(df[symbols])
symbol                   AAA        LA        YC
name                companyA    Lancer   Yocomin
date                                            
2018-09-01 00:00:00     None  0,422736      None
2018-10-01 00:00:00     None  0,487106      None
2018-11-01 00:00:00     None  0,331977      None
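The values in this output, such as 0,422736, use a decimal comma, which suggests the close column was read as text rather than as numbers. If numeric values are needed, one possible conversion is sketched below; the series `closes` is a made-up sample, and whether the real column needs this step is an assumption.

```python
import pandas as pd

# Hypothetical close values stored as text with a decimal comma.
closes = pd.Series(["0,422736", None, "0,487106"])

# Swap the comma for a dot, then parse; unparseable or missing
# entries become NaN because of errors="coerce".
numeric = pd.to_numeric(
    closes.str.replace(",", ".", regex=False), errors="coerce"
)

print(numeric)
```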