
In Python 2.7 Why Are Strings Written Faster In Text Mode Than In Binary Mode?

The following example script writes some strings to a file using either 'w', text, or 'wb', binary mode: import itertools as it from string import ascii_lowercase import time char

Solution 1:

Looking at the source code for file.write reveals the following difference between binary mode and text mode:

if (f->f_binary) {
    if (!PyArg_ParseTuple(args, "s*", &pbuf))
        return NULL;
    s = pbuf.buf;
    n = pbuf.len;
}
else {
    PyObject *text;
    if (!PyArg_ParseTuple(args, "O", &text))
        return NULL;

    if (PyString_Check(text)) {
        s = PyString_AS_STRING(text);
        n = PyString_GET_SIZE(text);
    }

Here f->f_binary is set when the mode passed to open includes "b". In that case Python fills in an auxiliary buffer (pbuf) from the string object and then reads the data pointer s and the length n from that buffer. I suppose this is for generality: the same code path serves any other object that supports the buffer interface.
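To illustrate the generality point, here is a minimal sketch of a binary-mode file accepting several buffer-compatible objects, not only strings. The helper name write_buffers is hypothetical and used only for this illustration. The calls work on Python 3, and I would expect the same behavior on Python 2.7 given the "s*" format shown above.

```python
import array
import os
import tempfile


def write_buffers(objects):
    """Write each object to a binary-mode temp file and return the file's contents."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            for obj in objects:
                f.write(obj)  # binary mode takes any buffer-compatible object
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


data = write_buffers([
    b"abc",                        # plain byte string
    bytearray(b"def"),             # mutable byte buffer
    array.array("B", [103, 104]),  # two unsigned bytes, the ASCII codes for "gh"
])
print(data)
```

All three objects end up in the file as raw bytes, which is the flexibility the buffer-based parsing buys at the cost of some per-call work.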

The call PyArg_ParseTuple(args, "s*", &pbuf) is what creates that buffer, and it costs extra compute time. In text mode, by contrast, Python simply parses the argument as a generic object ("O") at almost no cost. Retrieving the data and length via

s = PyString_AS_STRING(text);
n = PyString_GET_SIZE(text);

is also performed when the buffer is created.

This means that when working in binary mode there's an additional overhead associated with creating an auxiliary buffer object from the string object. For that reason the execution time is longer when working in binary mode.
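The overhead can be checked with a small timing harness. The sketch below is not the script from the question; the helper name time_writes, the chunk contents, and the repeat count are assumptions chosen for illustration. It runs on both Python 2.7 and Python 3, but the explanation above applies only to Python 2.7, so no particular result should be expected on other versions.

```python
import os
import tempfile
import timeit


def time_writes(mode, chunk, repeats=200000):
    """Write `chunk` to a temp file `repeats` times; return (seconds, file size)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, mode) as f:
            start = timeit.default_timer()
            for _ in range(repeats):
                f.write(chunk)
            # Only the write calls are timed, not the flush on close.
            elapsed = timeit.default_timer() - start
        size = os.path.getsize(path)
    finally:
        os.remove(path)
    return elapsed, size


if __name__ == "__main__":
    # On Python 2.7 both literals are the same str type, so only the mode differs.
    text_time, text_size = time_writes("w", "abcdefgh")
    binary_time, binary_size = time_writes("wb", b"abcdefgh")
    print("text mode:   %.4f s (%d bytes)" % (text_time, text_size))
    print("binary mode: %.4f s (%d bytes)" % (binary_time, binary_size))
```

Many short writes are used on purpose: the cost discussed here is paid per call, so it shows up most clearly when each call writes only a few bytes.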
