
Extracting A Date From A Log File?

I am trying to create a datetime object from an example log file string. I have been using a regex to parse it, but it fails when it is applied to the format of the log file.

Solution 1:

Not sure if this is what you want, but building a datetime object from a string can get complicated when the string is free-form. The dateutil package can help:

>>> import dateutil.parser
>>> s = 'ERROR 2019-02-03T23:21:20 cannot find file'
>>> dateutil.parser.parse(s, fuzzy=True)
datetime.datetime(2019, 2, 3, 23, 21, 20)

If that works for you, here is the function:

def convert_to_datetime(s):
    return dateutil.parser.parse(s, fuzzy=True)

Solution 2:

You also need to print the group you matched.

import re

s = 'ERROR 2019-02-03T23:21:20 cannot find file'
match = re.search(r'\d{4}-\d{2}-\d{2}', s)
print(match.group(0))
#2019-02-03

Also if you want to get the whole datetime string, you can do

import re
s = 'ERROR 2019-02-03T23:21:20 cannot find file'
match = re.search(r'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}', s)
print(match.group(0))
#2019-02-03T23:21:20

After this, if you want a datetime object, you can use the python-dateutil library (https://pypi.org/project/python-dateutil/):

from dateutil import parser
import re

s = 'ERROR 2019-02-03T23:21:20 cannot find file'
match = re.search(r'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}', s)

#Datetime string
dt = match.group(0)

#Datetime object
dt_obj = parser.parse(dt)
print(dt_obj)
#2019-02-03 23:21:20
print(type(dt_obj))
#<class 'datetime.datetime'>

Or, best of all, pass the whole line to parser.parse with fuzzy=True, as in Solution 1:

from dateutil import parser

s = 'ERROR 2019-02-03T23:21:20 cannot find file'
print(parser.parse(s, fuzzy=True))
#2019-02-03 23:21:20
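
If adding a dependency is not an option, the matched string can also be handed to the standard library, since the timestamp in the example is already in ISO 8601 form. This is only a minimal sketch, assuming every timestamp looks like the one in the sample line. The name extract_timestamp is made up for illustration and is not from the question.

```python
import re
from datetime import datetime

# Same pattern as in the regex examples above, compiled once
TIMESTAMP_RE = re.compile(r'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}')


def extract_timestamp(line):
    """Return the first ISO-style timestamp in line as a datetime, or None."""
    match = TIMESTAMP_RE.search(line)
    if match is None:
        return None
    # datetime.fromisoformat (Python 3.7+) accepts 'YYYY-MM-DDTHH:MM:SS'
    return datetime.fromisoformat(match.group(0))


s = 'ERROR 2019-02-03T23:21:20 cannot find file'
print(extract_timestamp(s))
# 2019-02-03 23:21:20
```

Note that fromisoformat raises a ValueError for strings that fit the pattern but are not real dates (month 13, for example), so wrap the call in try/except if the log may contain such values.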

Solution 3:

You're close. You just need to return the matched text:

import re

def convert_to_datetime(line):
    match = re.search(r'\d{4}-\d{2}-\d{2}', line)
    return match.group() if match else "No match"

Test:

t = convert_to_datetime('ERROR 2019-02-03T23:21:20 cannot find file')
print(t)

Output:

2019-02-03
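
The function above returns a string. If you want an actual date object instead, one option is to capture the parts with named groups and build the object from them. This is a rough sketch, not the only way to do it. The names DATE_RE and convert_to_date are hypothetical, and returning None instead of "No match" is my own choice so the caller can test the result directly.

```python
import re
from datetime import date

# Named groups make each part of the date addressable by name
DATE_RE = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')


def convert_to_date(line):
    """Return the first YYYY-MM-DD date in line as a date, or None if absent."""
    match = DATE_RE.search(line)
    if match is None:
        return None
    # date() raises ValueError if the digits do not form a real calendar date
    return date(int(match['year']), int(match['month']), int(match['day']))


print(convert_to_date('ERROR 2019-02-03T23:21:20 cannot find file'))
# 2019-02-03
print(convert_to_date('no timestamp here'))
# None
```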

Solution 4:

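First, after reading https://docs.python.org/3/library/re.html, be aware that in Python 3 \d is not exactly equivalent to [0-9]: for str patterns it matches any Unicode decimal digit unless the re.ASCII flag is set. Second, handle the case where nothing matches: the search then returns None, and calling a method such as .group() on None raises an AttributeError. Try something like: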

pattern = re.compile('[0-9]{4}-[0-9]{2}-[0-9]{2}')

match = pattern.search(line)
if match:
    matches.append(match)
...
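
To make both points concrete, here is a small sketch. The sample strings and variable names are made up for illustration, and the second string spells the same date with Arabic-Indic digits (written as escapes so they survive copy and paste).

```python
import re

line = 'ERROR 2019-02-03T23:21:20 cannot find file'
# '2019-02-03' written with Arabic-Indic digits (Unicode decimal digits)
unicode_line = 'ERROR \u0662\u0660\u0661\u0669-\u0660\u0662-\u0660\u0663 cannot find file'

unicode_digits = re.compile(r'\d{4}-\d{2}-\d{2}')          # any Unicode decimal digit
ascii_digits = re.compile(r'\d{4}-\d{2}-\d{2}', re.ASCII)  # \d restricted to [0-9]
explicit = re.compile(r'[0-9]{4}-[0-9]{2}-[0-9]{2}')       # ASCII digits only

print(unicode_digits.search(unicode_line) is not None)  # True
print(ascii_digits.search(unicode_line) is not None)    # False
print(explicit.search(unicode_line) is not None)        # False

# search() returns None when nothing matches, so check before calling .group()
matches = []
for candidate in (line, unicode_line, 'no date here'):
    match = explicit.search(candidate)
    if match is not None:
        matches.append(match.group())
print(matches)  # ['2019-02-03']
```

For ordinary ASCII log files the difference rarely matters, but [0-9] or re.ASCII makes the intent explicit.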

Solution 5:

Depending on what format you want the final string, here are 2 ways you can do this:

import re


def convert_to_datetime(line: str):
    # Find the date and the time separately, then join them with ' | '
    match = re.search(r'\d{4}-\d{2}-\d{2}', line).group()
    match += ' | ' + re.search(r'\d{2}:\d{2}:\d{2}', line).group()
    return match


def cut_out_datetime(line: str):
    # Drop the 'ERROR ' prefix, then replace each 'T' with ' | '
    line = re.sub('ERROR ', "", line)
    line = re.sub('T', " | ", line)
    return line


s = 'ERROR 2019-02-03T23:21:20'
print('   Test string: ', s)
print()
print('Extract method: ', convert_to_datetime(s))
print(' "Trim" method: ', cut_out_datetime(s))


# OUTPUT:
   Test string:  ERROR 2019-02-03T23:21:20

Extract method:  2019-02-03 | 23:21:20
 "Trim" method:  2019-02-03 | 23:21:20


There are other ways using positions and slicing, but this is the most similar to your original code. Replace the | as you see fit, or split the date and time into two separate strings ...
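
For completeness, the positions-and-slicing route could look roughly like this. It is only a sketch, and it assumes every line has exactly the layout of the test string (a level, one space, then the timestamp), which real log files may not guarantee.

```python
line = 'ERROR 2019-02-03T23:21:20'

# Split on whitespace: the second field is the timestamp
timestamp = line.split()[1]

# partition() splits once at the first 'T' into (before, separator, after)
date_part, _, time_part = timestamp.partition('T')

print(date_part)  # 2019-02-03
print(time_part)  # 23:21:20

# Fixed-width slicing gives the same result for this layout:
# 'ERROR ' is 6 characters, the date is 10, then 'T', then 8 for the time
print(line[6:16], line[17:25])  # 2019-02-03 23:21:20
```

The split version tolerates log levels of different lengths (INFO, WARNING), while the fixed slices break as soon as the prefix changes width.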
