Hashtag Counter Python
Solution 1:
Fundamentally, your function doesn't work because this line
hash_index = post_string.find(char)
will always find the index of the first hash tag in the string. This could be fixed by providing a start index to str.find, or, better, by not calling str.find at all and instead maintaining the index when iterating over the string (you can use enumerate for that). Better yet, don't use an index at all; you don't need one if you restructure your parser to use a state machine.
That said, a Pythonic implementation would replace the whole function with a regular expression, which would make it drastically shorter, correct, more readable, and likely more efficient.
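To make the str.find point concrete, here is a minimal sketch. The original function is not shown in this answer, so the helper name hash_positions and the sample string are assumptions made for illustration only; post_string and char are the names from the quoted line.

```python
def hash_positions(post_string):
    """Return the index of every '#' in post_string (hypothetical helper)."""
    positions = []
    # enumerate yields (index, character) pairs, so no str.find call is needed.
    for index, char in enumerate(post_string):
        if char == "#":
            positions.append(index)
    return positions


post = "good morning #zurich #limmat"

# Without a start index, str.find always reports the first match.
first = post.find("#")
print(first)                       # 13

# Passing a start index lets the search continue past the previous match.
print(post.find("#", first + 1))   # 21

print(hash_positions(post))        # [13, 21]
```

Calling post.find("#") inside a loop would keep returning 13, which is why the second hashtag is never reached.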
Solution 2:
This should work:
import string

alpha = string.ascii_letters + string.digits

def analyze(posts):
    hashtag_dict = {}
    for post in posts:
        for i in post.split():
            if i[0] == '#':
                current_hashtag = sanitize(i[1:])
                if len(current_hashtag) > 0:
                    if current_hashtag in hashtag_dict:
                        hashtag_dict[current_hashtag] += 1
                    else:
                        hashtag_dict[current_hashtag] = 1
    return hashtag_dict

def sanitize(s):
    # Keep leading letters and digits; stop at the first other character.
    s2 = ''
    for i in s:
        if i in alpha:
            s2 += i
        else:
            break
    return s2

posts = [
    "hi #weekend",
    "good morning #zurich #limmat",
    "spend my #weekend in #zurich",
    "#zurich <3",
    "#lindehof4Ever(lol)"
]
print(analyze(posts))
Solution 3:
With your help, I managed to get 2.75 points out of 4. Thanks a lot! I didn't copy-paste any of your solutions into the correction tool, I used my own version that I tried to improve with your suggestions. (I am sure if I posted any of your solutions I would've gotten 4/4.)
According to them, the official solution would have been:
def analyze(posts):
    tags = {}
    for post in posts:
        curHashtag = None
        for c in post:
            is_allowed_char = c.isalnum()
            if curHashtag != None and not is_allowed_char:
                if len(curHashtag) > 0 and not curHashtag[0].isdigit():
                    if curHashtag in tags.keys():
                        tags[curHashtag] += 1
                    else:
                        tags[curHashtag] = 1
                curHashtag = None
            if c == "#":
                curHashtag = ""
                continue
            if c.isalnum() and curHashtag != None:
                curHashtag += c
        if curHashtag != None:
            if len(curHashtag) > 0 and not curHashtag[0].isdigit():
                if curHashtag in tags.keys():
                    tags[curHashtag] += 1
                else:
                    tags[curHashtag] = 1
    return tags
This is of course not an elegant solution, but it uses exclusively what we have learned so far. Maybe it helps another beginner who wants to solve this exercise with the tools they have.
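The official solution repeats the same counting block twice. As a sketch only, and not part of the official answer, the same rules can be written with the counting moved into a helper. The name count_tag and the variable current are made up for this illustration; the behaviour is intended to match the code above (tags are alphanumeric, and empty tags or tags starting with a digit are ignored).

```python
def count_tag(tags, tag):
    """Count tag unless it is empty or starts with a digit (hypothetical helper)."""
    if tag and not tag[0].isdigit():
        tags[tag] = tags.get(tag, 0) + 1


def analyze(posts):
    tags = {}
    for post in posts:
        current = None  # None: outside a hashtag; a string: inside one
        for c in post:
            if c == "#":
                # A '#' ends any hashtag in progress and starts a new one.
                if current is not None:
                    count_tag(tags, current)
                current = ""
            elif current is not None:
                if c.isalnum():
                    current += c
                else:
                    # Any other character ends the hashtag.
                    count_tag(tags, current)
                    current = None
        # A hashtag may run to the end of the post.
        if current is not None:
            count_tag(tags, current)
    return tags


posts = [
    "hi #weekend",
    "good morning #zurich #limmat",
    "spend my #weekend in #zurich",
    "#zurich <3",
    "#lindehof4Ever(lol)",
]
print(analyze(posts))
```

The variable current acts as the parser state: it is None while outside a hashtag and holds the characters collected so far while inside one.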
Solution 4:
Well, this task can be done with regexes; don't be afraid to use them ;) Here is a quick solution.
#!/usr/bin/python3.4
import re

PATTERN = re.compile(r'#(\w+)')
posts = [
    "hi #weekend",
    "good morning #zurich #limmat",
    "spend my #weekend in #zurich",
    "#zurich <3"]
container = {}
for post in posts:
    # findall returns the captured group, i.e. the tag without the '#'.
    for element in PATTERN.findall(post):
        container[element] = container.get(element, 0) + 1
print(container)
Result:
{'zurich': 3, 'limmat': 1, 'weekend': 2}
EDIT
I would also like to use Counter from collections here.
#!/usr/bin/python3.4
import re
from collections import Counter
PATTERN = re.compile(r'#(\w+)')
posts = [
"hi #weekend",
"good morning #zurich #limmat",
"spend my #weekend in #zurich",
"#zurich <3"]
words = [word for post in posts for word in PATTERN.findall(post)]
counted = Counter(words)
print(counted)
# Result: Counter({'zurich': 3, 'weekend': 2, 'limmat': 1})
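One caveat: \w also matches digits and the underscore, so r'#(\w+)' accepts tags such as 4ever or snake_case, which the official solution quoted in Solution 3 would reject or cut short. The sketch below tightens the pattern under the assumption that tags are ASCII only, start with a letter, and continue with letters or digits. STRICT_PATTERN and count_hashtags are names invented for this example.

```python
import re
from collections import Counter

# Assumption: ASCII-only tags, first character a letter, then letters or digits.
STRICT_PATTERN = re.compile(r'#([A-Za-z][A-Za-z0-9]*)')


def count_hashtags(posts):
    """Count hashtags in posts using the stricter pattern (hypothetical helper)."""
    return Counter(tag for post in posts for tag in STRICT_PATTERN.findall(post))


posts = ["#zurich <3", "#lindehof4Ever(lol)", "#4ever", "#snake_case"]
print(count_hashtags(posts))
```

Here "#4ever" produces no tag at all, and "#snake_case" is counted as snake because the match stops at the underscore.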