In Python 2.7 Why Are Strings Written Faster In Text Mode Than In Binary Mode?
Solution 1:
Looking at the source code for file.write reveals the following difference between binary mode and text mode:
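if (f->f_binary) {
    /* binary mode: parse the argument through the buffer interface */
    if (!PyArg_ParseTuple(args, "s*", &pbuf))
        return NULL;
    s = pbuf.buf;
    n = pbuf.len;
}
else {
    /* text mode: take the argument as a plain object */
    PyObject *text;
    if (!PyArg_ParseTuple(args, "O", &text))
        return NULL;
    if (PyString_Check(text)) {
        s = PyString_AS_STRING(text);
        n = PyString_GET_SIZE(text);
    }
    /* ... (rest of the else branch omitted in this excerpt) */
}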
Here f->f_binary is set when the mode passed to open includes "b". In that case Python constructs an auxiliary buffer object (pbuf) from the string object and then takes the data pointer s and the length n from that buffer. I suppose this is done for generality, so that write also accepts other objects that support the buffer interface.
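To make "objects that support the buffer interface" a little more concrete, here is a small Python-level sketch. It uses memoryview as an analogy for what the "s*" format does in C; it is not the code path file.write actually takes, and data_and_length is a hypothetical helper name.

```python
def data_and_length(obj):
    """Return (raw bytes, length) of obj, obtained through a buffer view.

    Loosely mirrors `s = pbuf.buf; n = pbuf.len;` from the C excerpt.
    Only byte-oriented objects are used here, so len(view) is a byte count.
    """
    view = memoryview(obj)  # fails with TypeError if obj exposes no buffer
    return view.tobytes(), len(view)


# b"..." is a plain str on Python 2.7 and bytes on Python 3.
for obj in (b"spam", bytearray(b"eggs")):
    data, n = data_and_length(obj)
    print("%s: %d bytes" % (type(obj).__name__, n))
```

The same helper works for a string and for a bytearray, which is the kind of generality the buffer-based parsing buys in binary mode.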
Here PyArg_ParseTuple(args, "s*", &pbuf) creates the corresponding buffer object. This step takes additional compute time, whereas in text mode Python simply parses the argument as a generic object ("O") at almost no cost. Retrieving the data and length via s = PyString_AS_STRING(text); n = PyString_GET_SIZE(text); is work that is also performed when the buffer is created, so that part costs the same in both modes.
This means that binary mode carries the additional overhead of creating an auxiliary buffer object from the string object, and that is why writing takes longer in binary mode.
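The effect can be checked with a rough timing sketch along these lines. time_writes and payload_for are hypothetical helper names, the payload size and write count are arbitrary, and the numbers will vary with machine and disk. The explanation above applies only to Python 2.7's file object; on Python 3 the script still runs (bytes are used for "wb"), but writes go through the io module, so the result says nothing about this code path.

```python
import os
import sys
import tempfile
import timeit


def payload_for(mode, size=100):
    """Build a payload that the given mode accepts on this interpreter."""
    data = "x" * size
    # Python 3 requires bytes in binary mode; Python 2.7 takes str in both.
    if "b" in mode and sys.version_info[0] >= 3:
        data = data.encode("ascii")
    return data


def time_writes(mode, payload, count=10000, repeat=3):
    """Return the best time (seconds) for `count` write() calls in `mode`."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        def run():
            with open(path, mode) as f:
                for _ in range(count):
                    f.write(payload)
        return min(timeit.repeat(run, number=1, repeat=repeat))
    finally:
        os.remove(path)


if __name__ == "__main__":
    for mode in ("w", "wb"):
        best = time_writes(mode, payload_for(mode))
        print("mode %-2s: %.4f s" % (mode, best))
```

Taking the minimum over several repeats reduces noise from other processes; many small writes are used because the overhead in question is paid once per write() call.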