Pandas Read_excel Returning Nan For Cells Having Simple Formula If Excel File Is Created By Program
I use pd.read_excel to read an Excel file that was created by openpyxl and downloaded from a URL. The parsed DataFrame contains NaN wherever the cell value is a formula.
Solution 1:
not enough points to comment but this probably can help you:
Solution 2:
After some searching, I found that my question may be a duplicate of (or similar to):
and found more explanations from:
- python-openpyxl-read-xlsx-data-after-writing-on-existing-xlsx-with-formula
- openpyxl-data-only-gives-only-a-none-answer-when-storing-a-variable
- python-openpyxl-data-only-true-returning-none
- refresh-excel-external-data-with-python
Some notes (conclusions):
- openpyxl can write an Excel formula but doesn't calculate it. With the data_only=True argument it only reads the cached value from the last calculation by MS Excel or another application, if one exists.
- To solve this manually, as @Orlando's answer mentioned, open the file in Excel and save it (this automatically calculates and stores the formula results).
- To solve this programmatically (with Excel installed), just use win32com to open and save the file. (see this answer)
- To solve this programmatically (without Excel), you must calculate the results from the Excel formula string yourself or with a module like formulas, then set the calculated value back into the cell (Warning: this will delete the formula). If you also want to keep the formula together with a default/cached value, use XlsxWriter, which can write a formula to a cell with a default/cached value.
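The first note can be checked with a small experiment. This is only a sketch: the file name demo_formula.xlsx and the helper formula_and_cached_value are made up for illustration, and it assumes openpyxl is installed. It writes a formula with openpyxl, then reads the cell back once as a formula and once with data_only=True.

```python
import os
import tempfile

import openpyxl


def formula_and_cached_value(formula):
    """Write `formula` to B2 with openpyxl, then read it back both ways.

    Hypothetical helper for illustration only.
    """
    path = os.path.join(tempfile.mkdtemp(), "demo_formula.xlsx")

    # openpyxl stores the formula text but never calculates it.
    wb = openpyxl.Workbook()
    wb.active["B2"] = formula
    wb.save(path)

    # Default load: the cell value is the formula string itself.
    as_formula = openpyxl.load_workbook(path).active["B2"].value
    # data_only=True: the cell value is the cached result, if any was saved.
    as_cached = openpyxl.load_workbook(path, data_only=True).active["B2"].value
    return as_formula, as_cached


as_formula, as_cached = formula_and_cached_value("=100-1")
print(repr(as_formula))
print(repr(as_cached))
```

Because no spreadsheet application has calculated and saved the file, there is no cached result to return, and that empty cell is what shows up as NaN in pd.read_excel.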
In my case, because my formula is very simple, I use eval, like:
import openpyxl
wb = openpyxl.load_workbook('./test_formula2.xlsx')
ws = wb.active
ws.cell(2,2).value # '=100-1'
eval(ws.cell(2,2).value[1:]) # slice after '=', e.g. 99
to get the calculated result.
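One caveat with eval is that it runs whatever Python expression the cell contains, which matters when the file is downloaded from somewhere else. A stricter variant, sketched below with only the standard library, parses the text after '=' and accepts nothing but numbers, parentheses and + - * /. The name eval_simple_formula is hypothetical, and this is not a general Excel formula evaluator: cell references, functions such as SUM, and Excel's ^ operator are all rejected.

```python
import ast
import operator

# Arithmetic operators this sketch is willing to evaluate.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _eval_node(node):
    """Recursively evaluate a parsed expression made of numbers and + - * /."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left = _eval_node(node.left)
        right = _eval_node(node.right)
        return _BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression in formula")


def eval_simple_formula(formula):
    """Evaluate a formula string such as '=100-1' (hypothetical helper)."""
    if not isinstance(formula, str) or not formula.startswith("="):
        raise ValueError("not a formula: %r" % (formula,))
    try:
        tree = ast.parse(formula[1:], mode="eval")  # drop the leading '='
    except SyntaxError:
        raise ValueError("cannot parse formula: %r" % (formula,))
    return _eval_node(tree.body)


print(eval_simple_formula("=100-1"))
```

Anything outside that small grammar raises ValueError instead of being executed, so a cell such as '=A1+1' fails loudly rather than silently producing a wrong number.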
Solution 3:
You can use formulas. The following snippet seems to work:
import formulas
xl_model = formulas.ExcelModel().loads('test_formula.xlsx').finish()
xl_model.calculate()
xl_model.write(dirpath='.')
This will write a "TEST_FORMULA.XLSX" (all caps for some reason) file with calculated values in place of the formulas. Importantly, this does not rely on Excel.
Here is the formulas documentation if you need to dig into it.