
Proper Way To Search Through Line Of Text, re.findall() And re.search() Both Don't Fully Work

My question is a little odd, but maybe someone can provide some guidance. I have a line of text that I need to search through and pull out multiple recurring strings to populate a data

Solution 1:

Using the re.finditer function with a regex pattern that has named groups:

import pandas as pd
import re

txt = "Name : 'red' Wire : 'R' Name : 'blue' Wire: 'B' Name : 'orange' Name: 'yellow' Wire : 'Y'"
pat = re.compile(r"Name\s*:\s*'(?P<Name>[^']+)'\s+Wire\s*:\s*'(?P<Wire>[^']+)'")
items = [m.groupdict() for m in pat.finditer(txt)]
df = pd.DataFrame(items)
print(df)
  • (?P<Name>[^']+) - a named group; m.groupdict() returns each match's named groups as a dict, e.g. {'Name': 'red', 'Wire': 'R'}

The output:

     Name Wire
0     red    R
1    blue    B
2  yellow    Y
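
Note that 'orange' does not appear in the output: it has no Wire entry, so the pattern skips it. If unpaired names should be kept instead, one option is to make the Wire part optional. The following is only a minimal sketch and not part of the original answer; the names pat_optional and rows are made up for illustration, and it assumes a Name without a Wire should simply get None.

```python
import re

txt = "Name : 'red' Wire : 'R' Name : 'blue' Wire: 'B' Name : 'orange' Name: 'yellow' Wire : 'Y'"

# Same idea as the pattern above, but the Wire part sits in an optional
# non-capturing group, so a Name without a Wire still produces a match.
pat_optional = re.compile(
    r"Name\s*:\s*'(?P<Name>[^']+)'"
    r"(?:\s+Wire\s*:\s*'(?P<Wire>[^']+)')?"
)

# groupdict() reports a named group that did not participate as None.
rows = [m.groupdict() for m in pat_optional.finditer(txt)]
for row in rows:
    print(row)
```

Passing rows to pd.DataFrame, as in the code above, should then give a fourth row for 'orange' with a missing Wire value.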

Solution 2:

I'm not used to pandas, but I achieved this with a list comprehension; maybe it will be helpful to you:

import re

def populateNameWire(content):
    pairs = re.findall(r'Name *: *\'(?P<name>\w+)\' Wire *: *\'(?P<wire>\w+)\'', content)
    return [{'Name': name, 'Wire': wire} for name, wire in pairs]

print(populateNameWire("Name : 'red' Wire : 'R' Name : 'blue' Wire: 'B' Name : 'orange' Name: 'yellow' Wire : 'Y'"))

The output:

[{'Name': 'red', 'Wire': 'R'}, {'Name': 'blue', 'Wire': 'B'}, {'Name': 'yellow', 'Wire': 'Y'}]
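
As a side note, re.findall returns plain tuples when the pattern contains more than one group, so the group names in the pattern are not actually used by the list comprehension. Here is a small sketch to illustrate, assuming the same input line; the names txt, pairs and records are only for illustration.

```python
import re

txt = "Name : 'red' Wire : 'R' Name : 'blue' Wire: 'B' Name : 'orange' Name: 'yellow' Wire : 'Y'"

# With two capturing groups, findall yields one (name, wire) tuple per match.
pairs = re.findall(r"Name *: *'(\w+)' Wire *: *'(\w+)'", txt)
print(pairs)

# Each tuple can be turned into a dict by zipping it with the column names.
records = [dict(zip(("Name", "Wire"), pair)) for pair in pairs]
print(records)
```

Unnamed groups behave the same way here, which is why the tuple unpacking in the list comprehension works regardless of the group names.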
