
Why Does csv.DictReader Skip Empty Lines?

It seems that csv.DictReader skips empty lines, even when restval is set. Using the following, empty lines in the input file are skipped: import csv CSV_FIELDS = ('field1', 'field2

Solution 1:

Inside the csv.DictReader class:

# unlike the basic reader, we prefer not to return blanks,
# because we will typically wind up with a dict full of None
# values
while row == []:
    row = self.reader.next()

So empty rows are skipped. If you don't want to skip empty lines, you could instead use csv.reader.
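To see the difference side by side, here is a minimal sketch for Python 3. The sample data and the field names are made up for illustration (the field names follow the ones used in the question):

```python
import csv
import io

# Hypothetical sample data: two blank lines between two records.
data = "1,2,3\n\n\na,b,c\n"
fields = ("field1", "field2", "field3")

# csv.reader yields an empty list for each blank line.
plain_rows = list(csv.reader(io.StringIO(data, newline="")))

# csv.DictReader drops those empty rows, even with restval set.
dict_rows = list(
    csv.DictReader(io.StringIO(data, newline=""), fieldnames=fields, restval="")
)

print(plain_rows)
print(len(dict_rows))
```

With csv.reader you get four rows, two of them empty lists, and you can decide yourself what to do with the empty ones.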

Another option is to subclass csv.DictReader:

import csv
CSV_FIELDS = ("field1", "field2", "field3")

class MyDictReader(csv.DictReader):
    def next(self):
        if self.line_num == 0:
            # Used only for its side effect.
            self.fieldnames
        row = self.reader.next()
        self.line_num = self.reader.line_num

        d = dict(zip(self.fieldnames, row))
        lf = len(self.fieldnames)
        lr = len(row)
        if lf < lr:
            d[self.restkey] = row[lf:]
        elif lf > lr:
            for key in self.fieldnames[lr:]:
                d[key] = self.restval
        return d

for row in MyDictReader(open("f", 'rb'), fieldnames=CSV_FIELDS, restval=""):
    print(row)

yields

{'field2': '2', 'field3': '3', 'field1': '1'}
{'field2': '', 'field3': '', 'field1': ''}
{'field2': '', 'field3': '', 'field1': ''}
{'field2': 'b', 'field3': 'c', 'field1': 'a'}
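The subclass above is written for Python 2 (it overrides next and calls self.reader.next()). A rough Python 3 equivalent might look like the sketch below. It assumes that DictReader still exposes the reader, line_num, restkey and restval attributes, which is the case in current CPython, and it reads from an in-memory sample instead of the file "f". The class name BlankKeepingDictReader is just an illustrative choice.

```python
import csv
import io

CSV_FIELDS = ("field1", "field2", "field3")


class BlankKeepingDictReader(csv.DictReader):
    """DictReader variant that does not skip empty rows (Python 3 sketch)."""

    def __next__(self):
        if self.line_num == 0:
            # Used only for its side effect (reads the header if needed).
            self.fieldnames
        row = next(self.reader)
        self.line_num = self.reader.line_num

        # Same logic as the stock DictReader, minus the loop that
        # discards rows equal to [].
        d = dict(zip(self.fieldnames, row))
        lf = len(self.fieldnames)
        lr = len(row)
        if lf < lr:
            d[self.restkey] = row[lf:]
        elif lf > lr:
            for key in self.fieldnames[lr:]:
                d[key] = self.restval
        return d


# Hypothetical in-memory stand-in for the input file.
sample = io.StringIO("1,2,3\n\n\na,b,c\n", newline="")
rows = list(BlankKeepingDictReader(sample, fieldnames=CSV_FIELDS, restval=""))
for row in rows:
    print(row)
```

Note that this relies on the internals of DictReader rather than its documented interface, so it may need adjusting if the standard library implementation changes.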

Solution 2:

unutbu has already pointed out why this happens. A quick fix is to replace empty lines with ',' before passing them to DictReader; restval then fills in the remaining fields.

import csv

CSV_FIELDS = ("field1", "field2", "field3")

with open('test.csv') as f:
    # Turn each blank line into ',' so the reader sees a non-empty row.
    lines = (',' if line.isspace() else line for line in f)
    for row in csv.DictReader(lines, fieldnames=CSV_FIELDS, restval=""):
        print(row)

#output
{'field2': '2', 'field3': '3', 'field1': '1'}
{'field2': '', 'field3': '', 'field1': ''}
{'field2': '', 'field3': '', 'field1': ''}
{'field2': 'b', 'field3': 'c', 'field1': 'a'}

Update:

The code above does not work when a quoted value spans multiple lines and contains blank lines, because those blank lines would be replaced too. In that case you can use csv.reader like this:

RESTVAL = ''

with open('test.csv') as f:
    for row in csv.reader(f, quotechar='"'):
        if not row:
            # Don't use `dict.fromkeys` if RESTVAL is a mutable object;
            # use {k: RESTVAL for k in CSV_FIELDS} instead.
            print(dict.fromkeys(CSV_FIELDS, RESTVAL))
        else:
            print({k: v if v else RESTVAL for k, v in zip(CSV_FIELDS, row)})

If the file contains:

1,2,"


4"


a,b,c

then the output will be:

{'field2': '2', 'field3': '\n\n\n4', 'field1': '1'}
{'field2': '', 'field3': '', 'field1': ''}
{'field2': '', 'field3': '', 'field1': ''}
{'field2': 'b', 'field3': 'c', 'field1': 'a'}
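The same idea can be wrapped in a small generator so it is easy to test without a file. This is only a sketch for Python 3: the function name rows_with_blanks is made up, and the sample string mirrors the file contents shown above.

```python
import csv
import io

CSV_FIELDS = ("field1", "field2", "field3")
RESTVAL = ""


def rows_with_blanks(lines, fields=CSV_FIELDS, restval=RESTVAL):
    """Yield one dict per record, plus a filler dict for each blank line."""
    for row in csv.reader(lines, quotechar='"'):
        if not row:
            # Blank line outside of any quoted field.
            yield {k: restval for k in fields}
        else:
            yield {k: (v if v else restval) for k, v in zip(fields, row)}


# Hypothetical in-memory copy of the sample file shown above.
sample = io.StringIO('1,2,"\n\n\n4"\n\n\na,b,c\n', newline="")
result = list(rows_with_blanks(sample))
for record in result:
    print(record)
```

Because csv.reader does the quote handling itself, the blank lines inside the quoted value stay part of field3, and only the blank lines between records turn into filler dicts.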

Solution 3:

This is your file, with commas added to the empty lines:

1,2,3
,,
,,
a,b,c

I added commas, and now DictReader returns the two empty lines as {'field2': '', 'field3': '', 'field1': ''}. As for the restval argument, it only applies when a row has fewer values than there are fieldnames: the missing fields are set to restval.

Here you set three fields, and each row has three values. But restval is about missing columns, not missing lines.

Your lines were completely empty, so DictReader skipped them. Unless you add commas to indicate empty values, it has nothing to return for those lines.
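A small sketch for Python 3 that shows the three cases together. The sample data and the "MISSING" marker are made up so that values filled in by restval are easy to tell apart from real empty strings:

```python
import csv
import io

fields = ("field1", "field2", "field3")

# Hypothetical sample: a full row, a short row, a row of commas,
# a completely blank line, and another full row.
data = "1,2,3\nx\n,,\n\na,b,c\n"

rows = list(
    csv.DictReader(io.StringIO(data, newline=""), fieldnames=fields, restval="MISSING")
)
for row in rows:
    print(row)
```

The short row "x" gets restval for field2 and field3, the ",," row comes back as three empty strings, and the completely blank line does not show up at all.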
