Proper Way To Search Through Line Of Text, Re.findall() And Re.search() Both Don't Fully Work
My question is a little odd, maybe someone can provide some guidance. I have a line of text that I need to search through and pull out multiple recurring strings to populate a data
Solution 1:
With the re.finditer function and a specific regex pattern:
import pandas as pd
import re
txt = "Name : 'red' Wire : 'R' Name : 'blue' Wire: 'B' Name : 'orange' Name: 'yellow' Wire : 'Y'"
pat = re.compile(r"Name\s*:\s*'(?P<Name>[^']+)'\s+Wire\s*:\s*'(?P<Wire>[^']+)'")
items = [m.groupdict() for m in pat.finditer(txt)]
df = pd.DataFrame(items)
print(df)
(?P<Name>[^']+) - a named group; each named group is "translated" into a key of the dict returned by m.groupdict()
The output:
     Name Wire
0     red    R
1    blue    B
2  yellow    Y
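To make the difference between the three re calls concrete, here is a small sketch that reuses the same txt and pat as above. The variable names first_pair, tuples and dicts are only illustrative. It suggests why re.search and re.findall feel incomplete here: search stops at the first pair, findall gives bare tuples, while finditer yields match objects that still carry the group names.

```python
import re

txt = "Name : 'red' Wire : 'R' Name : 'blue' Wire: 'B' Name : 'orange' Name: 'yellow' Wire : 'Y'"
pat = re.compile(r"Name\s*:\s*'(?P<Name>[^']+)'\s+Wire\s*:\s*'(?P<Wire>[^']+)'")

# re.search: only the first Name/Wire pair is returned
first_pair = pat.search(txt).groupdict()

# re.findall: every pair, but as plain tuples (the group names are lost)
tuples = pat.findall(txt)

# re.finditer: one match object per pair, so groupdict() keeps the names
dicts = [m.groupdict() for m in pat.finditer(txt)]

print(first_pair)  # {'Name': 'red', 'Wire': 'R'}
print(tuples)      # [('red', 'R'), ('blue', 'B'), ('yellow', 'Y')]
print(dicts)
```

Note that 'orange' is skipped in every case, because the pattern requires a Wire entry directly after the Name.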
Solution 2:
I'm not used to pandas, but I achieved that with a list comprehension; maybe it will be helpful to you:
import re
def populateNameWire(content):
    # findall returns one (name, wire) tuple per match, in group order
    pairs = re.findall(r'Name *: *\'(?P<name>\w+)\' Wire *: *\'(?P<wire>\w+)\'', content)
    # turn each tuple into a dict with explicit keys
    return [{'Name': name, 'Wire': wire} for name, wire in pairs]

populateNameWire("Name : 'red' Wire : 'R' Name : 'blue' Wire: 'B' Name : 'orange' Name: 'yellow' Wire : 'Y'")
[{'Name': 'red', 'Wire': 'R'}, {'Name': 'blue', 'Wire': 'B'}, {'Name': 'yellow', 'Wire': 'Y'}]
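Both solutions drop 'orange', since it has no Wire entry. If you would rather keep such names, one possible variation (an assumption on my part, not something asked for in the question) is to make the Wire part an optional non-capturing group, so the Wire value comes back as None when it is missing. The names pat_optional and rows are hypothetical.

```python
import re

txt = "Name : 'red' Wire : 'R' Name : 'blue' Wire: 'B' Name : 'orange' Name: 'yellow' Wire : 'Y'"

# Assumption: a Name without a following Wire should still be kept.
# The trailing (?: ... )? makes the whole Wire part optional.
pat_optional = re.compile(
    r"Name\s*:\s*'(?P<Name>[^']+)'"
    r"(?:\s+Wire\s*:\s*'(?P<Wire>[^']+)')?"
)

# groupdict() reports None for a named group that did not take part in the match
rows = [m.groupdict() for m in pat_optional.finditer(txt)]

for row in rows:
    print(row)
```

With the sample line this gives four rows, and the 'orange' row has Wire set to None. Passing rows to pd.DataFrame should work the same way as in Solution 1.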