100daysofcode-with-python-course/transcripts/82-dataviz/3.txt at fixit · feoh/100daysofcode-with-python-course

01 Let's use feedparser to get the
03 RSS feed of our blog.
04 And I'm not going to do the live feed
07 because I want you to see the same results
10 if you go through this exercise.
12 So, I'm going to paste in the actual copy I made.
15 That said, if you do want the live data,
18 then just go to our blog, pybit.es, or PyBites.
22 Go to 'view page source' and search for RSS,
27 and we have two feeds: all.atom and all.rss.
30 I'm using the latter.
31 The nice thing about feedparser is that
34 you can just call ".parse" and it does a lot
37 of work behind the scenes.
38 Let's see what it does.
39 Okay, so entries.
41 Blog feed, my variable.
43 And let's look at what "entries" has.
45 And it gives me a lot,
47 so let's look at the first one.
51 Actually, if you want to pretty-print this, just do from
57 pprint import pprint,
59 and I usually give it an alias.
03 And now it's spaced out a bit better.
05 Look at that, what feedparser did
06 behind the scenes.
07 It took that RSS feed and put it in
09 a comprehensible data structure.
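A rough sketch of the steps so far, under a few assumptions: feedparser is a third-party package (pip install feedparser), the function name load_entries and the alias pp are made up for illustration, and the file name in the usage comment is a hypothetical local copy of the feed, not a path given in the video.

```python
from pprint import pprint as pp  # short alias for pretty-printing


def load_entries(feed_source):
    """Parse a feed and return its list of entries.

    feed_source may be a URL, a local file path, or a raw XML string.
    """
    # Imported inside the function so the snippet loads without the package.
    import feedparser  # third-party: pip install feedparser

    blog_feed = feedparser.parse(feed_source)
    return blog_feed.entries


# Hypothetical usage, with a saved copy of the feed next to the script:
# entries = load_entries("all.rss.xml")
# pp(entries[0])  # pretty-print the first entry
```

Working from a saved copy rather than the live URL keeps the results the same from one run to the next.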
12 Although this is a nice format,
14 there is still some work to do.
15 For example, the publish date is a string.
18 But we also have published_parsed,
20 but that's a time, a struct_time.
22 Now, the most convenient way to work with dates
25 is to use datetime.
27 So, let's write a helper to convert to a datetime.
31 And I call it just that.
32 It takes a date string.
34 A date string here has some time zone stuff,
37 plus zero one.
40 So, the first thing is to strip that off.
45 Just put it here so we can see it while I'm writing this.
52 Date string.
54 I can split it on plus and take the first element.
00 We can see this live.
09 And you cannot see this, but there's still
11 a trailing space here, so it's best to always
15 strip spaces that are not really needed.
19 So, you can see here that it disappeared.
21 And then, we do a datetime conversion.
24 And we can do that by strptime.
28 It takes a string, and the only tricky thing is that
31 you have to give it the format of the date string.
35 In this case, it's a weekday, a day,
37 a string month, so like a three-char month,
40 Jan, Feb, Mar,
42 a four-digit year here, uppercase Y,
45 hour, minute, seconds,
47 and let's see what that'll give me.
48 Okay, the nice thing about a datetime
50 is that it actually prints as a string.
52 So it's a bit tricky.
53 And if I look at what the object actually is,
57 it's a datetime.
58 And that's cool because datetime makes it then
00 very easy to work with dates.
02 For example, let's just return this.
06 So, I'm getting a datetime back.
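The helper described above could look roughly like this. The function name follows the "I call it just that" remark and is an assumption, as is the sample date string, which is made up in the same shape as the feed's dates.

```python
from datetime import datetime


def convert_to_datetime(date_str):
    """Convert a feed date string into a naive datetime.

    The UTC offset is discarded, as in the walkthrough.
    """
    # Drop everything from the '+' onward, then the trailing space.
    date_str = date_str.split("+")[0].strip()
    # %a weekday name, %d day, %b three-letter month, %Y four-digit year
    return datetime.strptime(date_str, "%a, %d %b %Y %H:%M:%S")


# Made-up sample in the same shape as the feed's date strings.
dt = convert_to_datetime("Sun, 07 Jan 2018 12:00:00 +0100")
print(dt)        # 2018-01-07 12:00:00
print(repr(dt))  # datetime.datetime(2018, 1, 7, 12, 0)
```

Two caveats with this sketch: splitting on "+" only handles positive offsets, and %a and %b depend on the locale, so it assumes English day and month names.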
07 What's cool about this is you can now
09 do calculations with datetimes.
13 So, let's make sure that
14 timedelta is imported here as well.
16 What if I want to...
18 So, this is the seventh of January.
20 But if I want to add like three days, right,
23 I could do datetime + timedelta(days=3)
29 And look at that.
30 I just added three days.
32 I mean, you don't even want to imagine
34 doing that on strings, right?
36 It's just not done, and... no.
39 That's totally not the way to go.
41 So, when you're working with dates,
43 have it in a datetime format.
44 Have it in a standardized form so that you can
47 easily do calculations with it.
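As a small illustration of the date arithmetic, here is a sketch with a made-up datetime for the seventh of January (the year and time are assumptions, not values from the feed).

```python
from datetime import datetime, timedelta

# Made-up value standing in for a parsed publish date.
published = datetime(2018, 1, 7, 12, 0, 0)

# Adding a timedelta gives a new datetime three days later.
three_days_later = published + timedelta(days=3)
print(three_days_later)  # 2018-01-10 12:00:00

# Subtracting two datetimes gives a timedelta back.
gap = three_days_later - published
print(gap.days)  # 3
```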
49 And, actually, for this exercise
51 I just want to have the datetime.year.
53 That's another advantage you see here.
56 Once I have the datetime, I can just pull out
58 different elements from that, right?
00 So here, I want the year and the month,
02 and I can just access that attribute-wise.
05 So, now I just get a string.
08 We will use this later to plot the data.
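Pulling the year and month off a datetime might look like the sketch below. The exact string format is not visible in the transcript, so the "YYYY-MM" layout and the name year_month are assumptions.

```python
from datetime import datetime


def year_month(dt):
    """Return a 'YYYY-MM' label for a datetime (the format is an assumption)."""
    # .year and .month are plain integer attributes on the datetime.
    return f"{dt.year}-{dt.month:02d}"


label = year_month(datetime(2018, 1, 7, 12, 0, 0))
print(label)  # 2018-01
```

Zero-padding the month keeps the labels sortable as plain strings, which is handy when they end up on a plot axis.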
11 The second helper I need is get_category.
14 Takes a link.
18 So, it takes a link, and it extracts the category
20 out of that.
21 And we have these known categories,
23 code challenge, new, special, and guest.
26 So, that's the dictionary.
28 The default should be an article.
33 And here I use a bit of regular expressions
36 to pull the category out of the link.
39 A raw string, any characters.
42 A literal .es/
48 one or more lower-case letters.
51 + says one or more, and anything after that.
55 Now the parentheses will capture this
58 one or more letters into a match,
00 and I can access that in the second argument
03 by the \1.
06 And I'm doing that on link.
09 And then, I can just do a nice get on the dictionary,
14 which will look for that category.
18 So it matches code challenge, Twitter,
20 special, or guest,
21 and if it finds it, cool.
23 If not, get will return None.
25 And it then goes to the or,
27 which returns default.
29 So this will always return something relevant, right?
31 Either I find the key in the dictionary,
34 or, if not, I return the default.
37 And that's it.
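A minimal sketch of the category helper as described. The dictionary keys and labels, the constant names, and the sample links are all hypothetical, since the transcript only names the categories in passing; the regex and the get-or-default pattern follow the description above.

```python
import re

# Hypothetical mapping: the exact keys and labels are not shown in the transcript.
KNOWN_CATEGORIES = {
    "codechallenge": "challenge",
    "twitter": "news",
    "special": "special",
    "guest": "guest",
}
DEFAULT_CATEGORY = "article"


def get_category(link):
    """Extract the category from a post link, falling back to the default."""
    # Keep only the run of lowercase letters right after ".es/".
    category = re.sub(r".*\.es/([a-z]+).*", r"\1", link)
    # .get returns None for unknown keys, so "or" supplies the default.
    return KNOWN_CATEGORIES.get(category) or DEFAULT_CATEGORY


# Made-up links for illustration.
print(get_category("https://pybit.es/codechallenge40.html"))  # challenge
print(get_category("https://pybit.es/python-data-viz.html"))  # article
```

If the pattern does not match at all, re.sub returns the link unchanged, which is not a key in the dictionary, so the default still applies. Passing the default as the second argument to get would behave the same way for this mapping.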
38 That's the pre-work we are going to do:
41 two important helpers.
42 Next up, we will go through the feed data,
46 putting it into some useful data structures.
49 And with that second part of the preparation done,
51 the plotting should be easy.