How To Prepare Large Datasets With Patsy's API?
I'm running a logistic regression and having trouble using Patsy's API to prepare the data when it is bigger than a small sample. Using the dmatrices function directly on a DataFra
Solution 1:
y and dta are DesignInfo objects: they encode all the information needed to take a row of a data frame and convert it to a row of a design matrix. They do not, though, have your actual data in them; to get a piece of your design matrix, you have to give them a piece of your data. To use them, you need to do something like:
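    from patsy import dmatrices

    # y and dta are the DesignInfo objects; iter_maker() yields chunks of raw data
    for data_chunk in iter_maker():
        y_chunk, design_chunk = dmatrices((y, dta), data_chunk,
                                          NA_action="drop",
                                          return_type="dataframe")
        # do something with y_chunk and design_chunk
        # ...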
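For context, here is a minimal end-to-end sketch of the chunked workflow. It assumes the two DesignInfo objects were produced by patsy's incr_dbuilders, which the truncated question does not confirm. The toy DataFrame, the formula, the column names, and the chunk size are all hypothetical stand-ins; in practice iter_maker would read chunks from disk or a database instead of slicing an in-memory frame.

```python
import numpy as np
import pandas as pd
from patsy import dmatrices, incr_dbuilders

# Hypothetical toy data standing in for a dataset too large to load at once.
rng = np.random.default_rng(0)
full = pd.DataFrame({
    "outcome": rng.integers(0, 2, size=12),
    "x": rng.normal(size=12),
    "group": ["a", "b", "c"] * 4,
})


def iter_maker(chunk_size=4):
    """Yield successive chunks; each call starts a fresh pass over the data."""
    for start in range(0, len(full), chunk_size):
        yield full.iloc[start:start + chunk_size]


# First pass: patsy scans every chunk to learn categorical levels and so on.
# The results hold the encoding rules, not the data itself.
y_info, x_info = incr_dbuilders("outcome ~ x + C(group)", iter_maker)

# Second pass: build the design matrices one chunk at a time.
chunk_shapes = []
for data_chunk in iter_maker():
    y_chunk, design_chunk = dmatrices(
        (y_info, x_info), data_chunk,
        NA_action="drop", return_type="dataframe",
    )
    chunk_shapes.append(design_chunk.shape)
    # Fit or update the model with y_chunk and design_chunk here.
```

Because the encoding is fixed during the first pass, every chunk gets the same columns in the same order, even if a given chunk happens to be missing one of the category levels.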