Skip to content Skip to sidebar Skip to footer

Comparing Sublists And Merging Them

I have a list that contains a lot of sublists, which are initially pairs of numbers, so it looks like: list = [[2, 3], [4, 5], [7, 8], [8, 9], [11, 12], [14, 15], [15, 16], [16, 17

Solution 1:

I think I wrote this once. It can be done with a single pass over the list.

alist = [[2, 3], [4, 5], [7, 8], [8, 9], [11, 12], [14, 15], [15, 16], [16, 17], [17, 18], [18, 19], [20, 21]]

l = [alist[0][:]]
for e in alist[1:]:
   if l[-1][-1] == e[0]:
      l[-1].append(e[1])
   else:
      l.append(e[:])

The code reads as start with the first pair. Loop over the rest. Check if the last element of the last list is the same as the first element of the pair. If so append the second element else append the pair to the list.

This results in l being:

[[2, 3], [4, 5], [7, 8, 9], [11, 12], [14, 15, 16, 17, 18, 19], [20, 21]]

If you only want the largest sublist I suggest:

>>> l = [[2, 3], [4, 5], [7, 8, 9], [11, 12], [14, 15, 16, 17, 18, 19], [20, 21]]
>>> max(l, key=len)
[14, 15, 16, 17, 18, 19]

And evaluated:

>>>alist = [[2, 3], [4, 5], [7, 8], [8, 9], [11, 12], [14, 15], [15, 16], [16, 17], [17, 18], [18, 19], [20, 21]]>>>>>>l = [alist[0][:]]>>>for e in alist[1:]:...if l[-1][-1] == e[0]:...      l[-1].append(e[1])...else:...      l.append(e[:])...>>>l
[[2, 3], [4, 5], [7, 8, 9], [11, 12], [14, 15, 16, 17, 18, 19], [20, 21]]
>>>alist
[[2, 3], [4, 5], [7, 8], [8, 9], [11, 12], [14, 15], [15, 16], [16, 17], [17, 18], [18, 19], [20, 21]]

And compared. The reduce solution takes 6.4 usecs:

$ python -mtimeit "list = [[2, 3], [4, 5], [7, 8], [8, 9], [11, 12], [14, 15], [15, 16], [16, 17], [17, 18], [18, 19], [20, 21]]""reduce(lambda x,y: x[:-1] + [x[-1] + y[1:]] if x[-1][-1] == y[0] else x + [y], list[1:], [list[0]])"100000 loops, best of 3: 6.4 usec per loop

The for loop takes 3.62 usecs:

$ python -mtimeit "alist = [[2, 3], [4, 5], [7, 8], [8, 9], [11, 12], [14, 15], [15, 16], [16, 17], [17, 18], [18, 19], [20, 21]]""l = [alist[0][:]]""for e in alist[1:]:""   if l[-1][-1] == e[0]:""      l[-1].append(e[1])""   else:""      l.append(e[:])"100000 loops, best of 3: 3.62 usec per loop

On Python 2.7.3. The for loop is 56% faster. The difference would likely be more pronounced with larger inputs as the cost of a list concatenation depends on the sum of the length of the two lists. Whereas appending to a list is slightly cheaper.

Solution 2:

using reduce

>>>reduce(...lambda x,y: x[:-1] + [x[-1] + y[1:]] if x[-1][-1] == y[0] else x + [y],...list[1:],...[list[0]]...)
[[2, 3], [4, 5], [7, 8, 9], [11, 12], [14, 15, 16, 17, 18, 19], [20, 21]]

Explanation

Here is the lamdba function in expanded form used with reduce.

def mergeOverlappingRange(x, y):
    if x[-1][-1] == y[0]: 
        return x[:-1] + [x[-1] + y[1:]]
    else:
        return x + [y]

reduce(mergeOverlappingRange, list[1:], [list[0]])

Post a Comment for "Comparing Sublists And Merging Them"