diff --git a/.ipynb_checkpoints/Scrapper-checkpoint.ipynb b/.ipynb_checkpoints/Scrapper-checkpoint.ipynb
index db7f202..4b74f53 100644
--- a/.ipynb_checkpoints/Scrapper-checkpoint.ipynb
+++ b/.ipynb_checkpoints/Scrapper-checkpoint.ipynb
@@ -2140,7 +2140,7 @@
"cell_type": "code",
"execution_count": 11,
"metadata": {
- "scrolled": false
+ "scrolled": true
},
"outputs": [
{
@@ -2398,249 +2398,43 @@
"metadata": {},
"outputs": [],
"source": [
- "d_copy=data.copy()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {},
- "outputs": [],
- "source": [
- "d_copy['Views'][810]=d_copy['Views'][810].replace(\"[a]\", \"\")\n",
- "\n",
- "for i in range (d_copy.shape[0]):\n",
- " if (',' in d_copy['Views'][i]):\n",
- " d_copy['Views'][i]=int(d_copy['Views'][i].replace(\",\", \"\"))\n",
- " elif ('.' in d_copy['Views'][i]):\n",
- " d_copy['Views'][i]=int(d_copy['Views'][i].replace(\".\", \"\"))"
+ "data['Views']=data['Views'].replace(\",\", \"\")"
]
},
{
"cell_type": "code",
- "execution_count": 15,
+ "execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
- "0 1987464\n",
- "1 1800417\n",
- "2 1745381\n",
- "3 1680347\n",
- "4 1589558\n",
- " ... \n",
- "1071 487160\n",
- "1072 462709\n",
- "1073 450514\n",
- "1074 440814\n",
- "1075 430351\n",
+ "0 1,987,464\n",
+ "1 1,800,417\n",
+ "2 1,745,381\n",
+ "3 1,680,347\n",
+ "4 1,589,558\n",
+ " ... \n",
+ "1071 487,160\n",
+ "1072 462,709\n",
+ "1073 450,514\n",
+ "1074 440,814\n",
+ "1075 430,351\n",
"Name: Views, Length: 1076, dtype: object"
]
},
- "execution_count": 15,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "d_copy['Views']"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 16,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "
\n",
- "\n",
- "
\n",
- " \n",
- " \n",
- " | \n",
- " Rank | \n",
- " Article | \n",
- " Class | \n",
- " Views | \n",
- " Image | \n",
- " Notes/about | \n",
- "
\n",
- " \n",
- " \n",
- " \n",
- " | 0 | \n",
- " 1 | \n",
- " Tanya Roberts | \n",
- " B-Class article | \n",
- " 1987464 | \n",
- " | \n",
- " Whether you remember her as a Bond girl, a Charlie's Angel, or maybe the funniest character on That '70s Show, Roberts is probably an actress you liked. There was little fanfare when she was taken to a hospital for breathing problems on December 24. There was much more to talk about when she was reported dead. Then alive. On a cycle for three days. A miscommunication between her partner and her manager had the media reporting her death on January 3, which was soon retracted, but she passed away the following evening. Possibly the first scandal of 2021, it was quickly overshadowed by, well, the entire rest of the week (except on this list, obviously, where politics only starts at #3.) | \n",
- "
\n",
- " \n",
- " | 1 | \n",
- " 2 | \n",
- " Bridgerton | \n",
- " C-Class article | \n",
- " 1800417 | \n",
- " | \n",
- " So how about them Netflix period dramas, huh? Real entertaining. | \n",
- "
\n",
- " \n",
- " | 2 | \n",
- " 3 | \n",
- " Twenty-fifth Amendment to the United States Constitution | \n",
- " Good article | \n",
- " 1745381 | \n",
- " | \n",
- " Following the Capitol storming, there was talk of removing Trump from power in his last 2 weeks of office. Impeachment needs a House majority and conviction requires a Senate supermajority, but would prevent the President from ever taking office again. The 25th Amendment requires support from the Vice President and a majority of Cabinet officials, but only forces a temporary abdication. Every politician worth his salt is calling for at least one of these, except Biden (in public), who has to be the dutiful mature leader of the sort-of free world already (in public)... We can only assume CNN said \"25th Amendment\" and everyone realized they don't really know which one that is. Well, as soon as you get to the 20s it's hard to remember it all: I know one of them ended Prohibition, and (now) that there's a clause in the 25th that allows the Cabinet to declare the President unfit - a clause that has never been invoked. | \n",
- "
\n",
- " \n",
- " | 3 | \n",
- " 4 | \n",
- " Jon Ossoff | \n",
- " C-Class article | \n",
- " 1680347 | \n",
- " | \n",
- " Biden won in Georgia - a solid red state - last November, and Democrats Ossoff and Warnock were able to force runoff elections for their Senate seats. The runoff elections were held Tuesday, and both Democrats narrowly won. The Democratic Party now has the slimmest majority on the Senate - news announced on Wednesday, when a different transfer of power was taking place. | \n",
- "
\n",
- " \n",
- " | 4 | \n",
- " 5 | \n",
- " 2021 storming of the United States Capitol | \n",
- " C-Class article | \n",
- " 1589558 | \n",
- " | \n",
- " The final procedure of a United States presidential election is the certification of Electoral College votes by Congress. This is a ceremonial procedure, and elections are rarely decided at this stage. President Trump held a rally outside of Congress on January 6, the day of the certification. Trump left in the morning; at 2:15 p.m., barricades around the Capitol building were breached by protestors, and Congresspeople were evacuated to a bunker soon after. Protestors were mostly interested in taking photos for social media, and were removed soon after.While only 13 were arrested on the day of the storming, several have been arrested after the fact; this includes a W.V. state delegate that livestreamed the breach, the guy that stole Nancy Pelosi's lectern, and the \"QAnon\" shaman. | \n",
- "
\n",
- " \n",
- " | ... | \n",
- " ... | \n",
- " ... | \n",
- " ... | \n",
- " ... | \n",
- " ... | \n",
- " ... | \n",
- "
\n",
- " \n",
- " | 1071 | \n",
- " 21 | \n",
- " Elon Musk | \n",
- " Good article | \n",
- " 487160 | \n",
- " | \n",
- " Sigh... what has he done now?I'm assuming the reason Mr. Musk is on this list is that his net worth surpassed US$300 billion on October 29, making him the richest person in history. The timing of this was pretty rich (pun intended) because it happened right after US lawmakers proposed a billionaires' tax that would definitely take away some of that wealth. | \n",
- "
\n",
- " \n",
- " | 1072 | \n",
- " 22 | \n",
- " Halloween Kills | \n",
- " C-Class article | \n",
- " 462709 | \n",
- " | \n",
- " “—What the fuck is this mask?—Austin Powers.—I said Michael Myers.—This *is* Mike Myers.—It should be the Halloween mask.—This is a Halloween (#10) mask!”Baby Driver quotes aside, in spite of mixed reviews this slasher film has been making a killing at the box office, easily recouping its $20 million budget many times over. But hearing that next year's follow-up Halloween Ends is seemingly set in the present day, post-COVID and all, is certainly a headscratcher. | \n",
- "
\n",
- " \n",
- " | 1073 | \n",
- " 23 | \n",
- " Puneeth Rajkumar filmography | \n",
- " List-Class article | \n",
- " 450514 | \n",
- " | \n",
- " Back to our #1, his body of work, that started with a film appearance at just six months old, and his older brother. | \n",
- "
\n",
- " \n",
- " | 1074 | \n",
- " 24 | \n",
- " Shiva Rajkumar | \n",
- " Start-Class article | \n",
- " 440814 | \n",
- " | \n",
- " This holiday, celebrating the Constitution of India -- the longest written constitution of any country -- going into effect was celebrated on January 26 this year, as it is every year. | \n",
- "
\n",
- " \n",
- " | 1075 | \n",
- " 25 | \n",
- " Victoria Pedretti | \n",
- " Start-Class article | \n",
- " 430351 | \n",
- " | \n",
- " Keeping off the list the first film adaptation of #4 and one of #2's 2019's stars (although she only has six minutes of screentime, given the movie ends right as her character really enters the plot) is the female lead of #18. If only we could get an image of her on Commons! | \n",
- "
\n",
- " \n",
- "
\n",
- "
1076 rows × 6 columns
\n",
- "
"
- ],
- "text/plain": [
- " Rank Article \\\n",
- "0 1 Tanya Roberts \n",
- "1 2 Bridgerton \n",
- "2 3 Twenty-fifth Amendment to the United States Constitution \n",
- "3 4 Jon Ossoff \n",
- "4 5 2021 storming of the United States Capitol \n",
- "... ... ... \n",
- "1071 21 Elon Musk \n",
- "1072 22 Halloween Kills \n",
- "1073 23 Puneeth Rajkumar filmography \n",
- "1074 24 Shiva Rajkumar \n",
- "1075 25 Victoria Pedretti \n",
- "\n",
- " Class Views Image \\\n",
- "0 B-Class article 1987464 \n",
- "1 C-Class article 1800417 \n",
- "2 Good article 1745381 \n",
- "3 C-Class article 1680347 \n",
- "4 C-Class article 1589558 \n",
- "... ... ... ... \n",
- "1071 Good article 487160 \n",
- "1072 C-Class article 462709 \n",
- "1073 List-Class article 450514 \n",
- "1074 Start-Class article 440814 \n",
- "1075 Start-Class article 430351 \n",
- "\n",
- " Notes/about \n",
- "0 Whether you remember her as a Bond girl, a Charlie's Angel, or maybe the funniest character on That '70s Show, Roberts is probably an actress you liked. There was little fanfare when she was taken to a hospital for breathing problems on December 24. There was much more to talk about when she was reported dead. Then alive. On a cycle for three days. A miscommunication between her partner and her manager had the media reporting her death on January 3, which was soon retracted, but she passed away the following evening. Possibly the first scandal of 2021, it was quickly overshadowed by, well, the entire rest of the week (except on this list, obviously, where politics only starts at #3.) \n",
- "1 So how about them Netflix period dramas, huh? Real entertaining. \n",
- "2 Following the Capitol storming, there was talk of removing Trump from power in his last 2 weeks of office. Impeachment needs a House majority and conviction requires a Senate supermajority, but would prevent the President from ever taking office again. The 25th Amendment requires support from the Vice President and a majority of Cabinet officials, but only forces a temporary abdication. Every politician worth his salt is calling for at least one of these, except Biden (in public), who has to be the dutiful mature leader of the sort-of free world already (in public)... We can only assume CNN said \"25th Amendment\" and everyone realized they don't really know which one that is. Well, as soon as you get to the 20s it's hard to remember it all: I know one of them ended Prohibition, and (now) that there's a clause in the 25th that allows the Cabinet to declare the President unfit - a clause that has never been invoked. \n",
- "3 Biden won in Georgia - a solid red state - last November, and Democrats Ossoff and Warnock were able to force runoff elections for their Senate seats. The runoff elections were held Tuesday, and both Democrats narrowly won. The Democratic Party now has the slimmest majority on the Senate - news announced on Wednesday, when a different transfer of power was taking place. \n",
- "4 The final procedure of a United States presidential election is the certification of Electoral College votes by Congress. This is a ceremonial procedure, and elections are rarely decided at this stage. President Trump held a rally outside of Congress on January 6, the day of the certification. Trump left in the morning; at 2:15 p.m., barricades around the Capitol building were breached by protestors, and Congresspeople were evacuated to a bunker soon after. Protestors were mostly interested in taking photos for social media, and were removed soon after.While only 13 were arrested on the day of the storming, several have been arrested after the fact; this includes a W.V. state delegate that livestreamed the breach, the guy that stole Nancy Pelosi's lectern, and the \"QAnon\" shaman. \n",
- "... ... \n",
- "1071 Sigh... what has he done now?I'm assuming the reason Mr. Musk is on this list is that his net worth surpassed US$300 billion on October 29, making him the richest person in history. The timing of this was pretty rich (pun intended) because it happened right after US lawmakers proposed a billionaires' tax that would definitely take away some of that wealth. \n",
- "1072 “—What the fuck is this mask?—Austin Powers.—I said Michael Myers.—This *is* Mike Myers.—It should be the Halloween mask.—This is a Halloween (#10) mask!”Baby Driver quotes aside, in spite of mixed reviews this slasher film has been making a killing at the box office, easily recouping its $20 million budget many times over. But hearing that next year's follow-up Halloween Ends is seemingly set in the present day, post-COVID and all, is certainly a headscratcher. \n",
- "1073 Back to our #1, his body of work, that started with a film appearance at just six months old, and his older brother. \n",
- "1074 This holiday, celebrating the Constitution of India -- the longest written constitution of any country -- going into effect was celebrated on January 26 this year, as it is every year. \n",
- "1075 Keeping off the list the first film adaptation of #4 and one of #2's 2019's stars (although she only has six minutes of screentime, given the movie ends right as her character really enters the plot) is the female lead of #18. If only we could get an image of her on Commons! \n",
- "\n",
- "[1076 rows x 6 columns]"
- ]
- },
- "execution_count": 16,
+ "execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
- "d_copy"
+ "data[\"Views\"]"
]
},
{
"cell_type": "code",
- "execution_count": 17,
+ "execution_count": 21,
"metadata": {},
"outputs": [
{
@@ -2746,7 +2540,7 @@
"[660 rows x 2 columns]"
]
},
- "execution_count": 17,
+ "execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
@@ -2760,531 +2554,35 @@
},
{
"cell_type": "code",
- "execution_count": 18,
+ "execution_count": 22,
"metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "
\n",
- " \n",
- " \n",
- " | \n",
- " Article | \n",
- " Nombre d'apparitions | \n",
- " Moyenne du nombre de vus | \n",
- " Notes/about | \n",
- "
\n",
- " \n",
- " \n",
- " \n",
- " | 0 | \n",
- " 1989 Tiananmen Square protests | \n",
- " 1 | \n",
- " 475850 | \n",
- " [32 years since the Chinese government responded to students protesting by sending in rifles and tanks (in the latter case, inspiring an iconic image, which seemingly vanished from Bing last week). And they have since done everything in their power to deny the violent events that ironically happened in a \"Heavenly Peace Square\", down to cracking down a vigil that happened in Hong Kong last week. Or somehow comparing the events to the invasion of the U.S. Capitol in January, when that one didn't end with hundreds of unarmed civilians being shot dead.] | \n",
- "
\n",
- " \n",
- " | 1 | \n",
- " 2012 Summer Olympics medal table | \n",
- " 1 | \n",
- " 541757 | \n",
- " [Because people need something to compare #8 to. Though why 2012 got more views than 2016? Well, it probably has to do with how before Tokyo (#1), London was India's (#4) best performance.] | \n",
- "
\n",
- " \n",
- " | 2 | \n",
- " 2016 West Bengal Legislative Assembly election | \n",
- " 1 | \n",
- " 564338 | \n",
- " [With #8, we wanted to remind themselves of what happened last time out.] | \n",
- "
\n",
- " \n",
- " | 3 | \n",
- " 2019 Canadian federal election | \n",
- " 1 | \n",
- " 679216 | \n",
- " [Two years ago, the Liberals didn't get a majority in the parliament. They tried again this year (#5), and still couldn't.] | \n",
- "
\n",
- " \n",
- " | 4 | \n",
- " 2020 Summer Olympics | \n",
- " 5 | \n",
- " 2123836.6 | \n",
- " [The opening ceremonies are scheduled for July 23. The games were planned to showcase how Japan bounced back from the 2011 tsunami, but instead became overshadowed by the COVID-19 pandemic. The pandemic has so far pushed back the games by a year, caused a regional lockdown weeks before the opening, and prompted the creation of a condom-free-but-probably-not-sex-free Olympic village. Doesn't help that in spite of being one body of water away from the pandemic's origin, Japan took very long to start vaccinating its population and now the populace is afraid of a COVID resurgence., Sports fans now will spend two weeks in the very unfavorable Japan Standard Time to see the multi-sport event that started in spite of the goddamned pandemic delaying it for a year (and the protests of a populace that started immunizing itself too late). One of #1's teammates, Jordan Nwora, will compete for the Nigerian basketball team - Giannis' Greece certainly missed him as they couldn't qualify., Sports fans of the New World (and maybe a few of the Old one as well) are currently sleep-deprived to fit into the Tokyo Standard Time where the biggest multi-sport event is happening. The pandemic that delayed the Olympics is still having its effects felt, with medalists forced to wear masks on the podium and such., One more week where sports fans supported their countries from a distance, even in host city Tokyo as the same pandemic that delayed it for a year forced events without outsiders or reduced crowds. The Games closed on the Sunday this Report was published, to the relief of those who are losing their sleep to watch events late at night. At least the next ones are only three years away and the winter one is 6 months from now!, An iconic talk show host whose career spanned over six decades, King passed away on Saturday last week, earning him a higher spot this week.] | \n",
- "
\n",
- " \n",
- " | ... | \n",
- " ... | \n",
- " ... | \n",
- " ... | \n",
- " ... | \n",
- "
\n",
- " \n",
- " | 655 | \n",
- " ZZ Top | \n",
- " 1 | \n",
- " 575452 | \n",
- " [A bearded Texan rock trio - including a guy actually named Beard, who after decades with only a mustache grew a goatee - who ever since 1969 have been responsible for many classic tunes such as \"La Grange\" and \"Sharp Dressed Man\" (and for Back to the Future fans, \"Doubleback\"). One of their frontmen died (#9) 5 days after the band played without him for the first time in 51 years - the other leader, Billy Gibbons, said Hill specifically asked him to put his guitar tech in his place.] | \n",
- "
\n",
- " \n",
- " | 656 | \n",
- " Zack Snyder | \n",
- " 2 | \n",
- " 651564.0 | \n",
- " [A director with a certainly distinctive style, translated into many comic book adaptations - most recently our #1 (and indirectly, #5). This year also has Snyder's heist-during-the-zombie-apocalypse flick Army of the Dead., Snyder is one of the clearer beneficiaries of vulgar auteurism; how else do you go from making the reviled-by-nerds 300 and Watchmen to getting those same nerds to push for millions of dollars in reshoots for a superhero movie?] | \n",
- "
\n",
- " \n",
- " | 657 | \n",
- " Zack Snyder's Justice League | \n",
- " 4 | \n",
- " 1632921.75 | \n",
- " [The Justice League that was released in 2017 had Snyder credited as a director, but much of the film was altered in post-production and reshoots by Joss Whedon. Some fans were convinced that the \"Snyder cut,\" which surely existed somewhere, was far better than the lackluster Whedon version. However, due to the way effects-heavy superhero films are produced, the cut was probably far from complete (Folding Ideas has a good video about why.) After $70 million of effects and reshoots, the Snyder cut is finally complete - with a trailer dropping this last Sunday., All the cries of #ReleaseTheSnyderCut were filled, at the asking price of an HBO Max subscription and 4 hours to kill. Reviews were positive, even if the indulgent nature of this superhero project (again: the thing lasts 4 hours!) is a contentious point., On one hand, the film is more consistent and the added content fleshes out characters such as Cyborg and Steppenwolf. On the other hand, it's also consistently too grim (the washed colors and moody soundtrack help) and the amount of slow motion is abusive. Still, a valid if overlong effort. Now let's see what lies ahead, even if fans aren't ready to move on (#21)., Fans are now griping #RestoreTheSnyderverse given the extended cut of Justice League was released, but Warner Bros. won't continue it. Well, they should focus on how the studio makes worse decisions regarding the DC Extended Universe than that - just because Darkseid appears in a half a dozen scenes of this 4 hour movie, Warner cancelled a promising New Gods movie that would feature the famed DC villain.] | \n",
- "
\n",
- " \n",
- " | 658 | \n",
- " Zitkala-Sa | \n",
- " 1 | \n",
- " 851153 | \n",
- " [This early-20th-century Yankton Dakota activist and author was born on February 22, 1876. For her birthday, she was commemorated by a Google Doodle.] | \n",
- "
\n",
- " \n",
- " | 659 | \n",
- " Zodiac Killer | \n",
- " 1 | \n",
- " 1088838 | \n",
- " [The Case Breakers, an independent team of 40 cold case investigators, claimed they identified this still mysterious murderer who terrorized the San Francisco Bay Area in the late 1960s. The police disagrees with their discovery, deeming it too reliant on circumstantial evidence.] | \n",
- "
\n",
- " \n",
- "
\n",
- "
660 rows × 4 columns
\n",
- "
"
- ],
- "text/plain": [
- " Article Nombre d'apparitions \\\n",
- "0 1989 Tiananmen Square protests 1 \n",
- "1 2012 Summer Olympics medal table 1 \n",
- "2 2016 West Bengal Legislative Assembly election 1 \n",
- "3 2019 Canadian federal election 1 \n",
- "4 2020 Summer Olympics 5 \n",
- ".. ... ... \n",
- "655 ZZ Top 1 \n",
- "656 Zack Snyder 2 \n",
- "657 Zack Snyder's Justice League 4 \n",
- "658 Zitkala-Sa 1 \n",
- "659 Zodiac Killer 1 \n",
- "\n",
- " Moyenne du nombre de vus \\\n",
- "0 475850 \n",
- "1 541757 \n",
- "2 564338 \n",
- "3 679216 \n",
- "4 2123836.6 \n",
- ".. ... \n",
- "655 575452 \n",
- "656 651564.0 \n",
- "657 1632921.75 \n",
- "658 851153 \n",
- "659 1088838 \n",
- "\n",
- " Notes/about \n",
- "0 [32 years since the Chinese government responded to students protesting by sending in rifles and tanks (in the latter case, inspiring an iconic image, which seemingly vanished from Bing last week). And they have since done everything in their power to deny the violent events that ironically happened in a \"Heavenly Peace Square\", down to cracking down a vigil that happened in Hong Kong last week. Or somehow comparing the events to the invasion of the U.S. Capitol in January, when that one didn't end with hundreds of unarmed civilians being shot dead.] \n",
- "1 [Because people need something to compare #8 to. Though why 2012 got more views than 2016? Well, it probably has to do with how before Tokyo (#1), London was India's (#4) best performance.] \n",
- "2 [With #8, we wanted to remind themselves of what happened last time out.] \n",
- "3 [Two years ago, the Liberals didn't get a majority in the parliament. They tried again this year (#5), and still couldn't.] \n",
- "4 [The opening ceremonies are scheduled for July 23. The games were planned to showcase how Japan bounced back from the 2011 tsunami, but instead became overshadowed by the COVID-19 pandemic. The pandemic has so far pushed back the games by a year, caused a regional lockdown weeks before the opening, and prompted the creation of a condom-free-but-probably-not-sex-free Olympic village. Doesn't help that in spite of being one body of water away from the pandemic's origin, Japan took very long to start vaccinating its population and now the populace is afraid of a COVID resurgence., Sports fans now will spend two weeks in the very unfavorable Japan Standard Time to see the multi-sport event that started in spite of the goddamned pandemic delaying it for a year (and the protests of a populace that started immunizing itself too late). One of #1's teammates, Jordan Nwora, will compete for the Nigerian basketball team - Giannis' Greece certainly missed him as they couldn't qualify., Sports fans of the New World (and maybe a few of the Old one as well) are currently sleep-deprived to fit into the Tokyo Standard Time where the biggest multi-sport event is happening. The pandemic that delayed the Olympics is still having its effects felt, with medalists forced to wear masks on the podium and such., One more week where sports fans supported their countries from a distance, even in host city Tokyo as the same pandemic that delayed it for a year forced events without outsiders or reduced crowds. The Games closed on the Sunday this Report was published, to the relief of those who are losing their sleep to watch events late at night. At least the next ones are only three years away and the winter one is 6 months from now!, An iconic talk show host whose career spanned over six decades, King passed away on Saturday last week, earning him a higher spot this week.] \n",
- ".. ... \n",
- "655 [A bearded Texan rock trio - including a guy actually named Beard, who after decades with only a mustache grew a goatee - who ever since 1969 have been responsible for many classic tunes such as \"La Grange\" and \"Sharp Dressed Man\" (and for Back to the Future fans, \"Doubleback\"). One of their frontmen died (#9) 5 days after the band played without him for the first time in 51 years - the other leader, Billy Gibbons, said Hill specifically asked him to put his guitar tech in his place.] \n",
- "656 [A director with a certainly distinctive style, translated into many comic book adaptations - most recently our #1 (and indirectly, #5). This year also has Snyder's heist-during-the-zombie-apocalypse flick Army of the Dead., Snyder is one of the clearer beneficiaries of vulgar auteurism; how else do you go from making the reviled-by-nerds 300 and Watchmen to getting those same nerds to push for millions of dollars in reshoots for a superhero movie?] \n",
- "657 [The Justice League that was released in 2017 had Snyder credited as a director, but much of the film was altered in post-production and reshoots by Joss Whedon. Some fans were convinced that the \"Snyder cut,\" which surely existed somewhere, was far better than the lackluster Whedon version. However, due to the way effects-heavy superhero films are produced, the cut was probably far from complete (Folding Ideas has a good video about why.) After $70 million of effects and reshoots, the Snyder cut is finally complete - with a trailer dropping this last Sunday., All the cries of #ReleaseTheSnyderCut were filled, at the asking price of an HBO Max subscription and 4 hours to kill. Reviews were positive, even if the indulgent nature of this superhero project (again: the thing lasts 4 hours!) is a contentious point., On one hand, the film is more consistent and the added content fleshes out characters such as Cyborg and Steppenwolf. On the other hand, it's also consistently too grim (the washed colors and moody soundtrack help) and the amount of slow motion is abusive. Still, a valid if overlong effort. Now let's see what lies ahead, even if fans aren't ready to move on (#21)., Fans are now griping #RestoreTheSnyderverse given the extended cut of Justice League was released, but Warner Bros. won't continue it. Well, they should focus on how the studio makes worse decisions regarding the DC Extended Universe than that - just because Darkseid appears in a half a dozen scenes of this 4 hour movie, Warner cancelled a promising New Gods movie that would feature the famed DC villain.] \n",
- "658 [This early-20th-century Yankton Dakota activist and author was born on February 22, 1876. For her birthday, she was commemorated by a Google Doodle.] \n",
- "659 [The Case Breakers, an independent team of 40 cold case investigators, claimed they identified this still mysterious murderer who terrorized the San Francisco Bay Area in the late 1960s. The police disagrees with their discovery, deeming it too reliant on circumstantial evidence.] \n",
- "\n",
- "[660 rows x 4 columns]"
- ]
- },
- "execution_count": 18,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
+ "outputs": [],
"source": [
"view=[]\n",
"notes=[]\n",
"\n",
"for i in range(new_data.shape[0]):\n",
- " if (new_data[\"Nombre d'apparitions\"][i]!=1):\n",
- " n=new_data[\"Nombre d'apparitions\"][i]\n",
- " list=d_copy[\"Views\"][d_copy.Article==new_data.Article[i]].tolist()\n",
- " view.append(sum(list)/n)\n",
- " else :\n",
- " view.append(d_copy[\"Views\"][d_copy.Article==new_data.Article[i]].tolist()[0])\n",
- " notes.append(d_copy[\"Notes/about\"][data.Article==new_data.Article[i]].tolist())\n",
- "\n",
- "new_data['Moyenne du nombre de vus']=view\n",
- "new_data['Notes/about']=notes\n",
- "new_data"
+ " notes.append(data[\"Notes/about\"][data.Article==new_data.Article[i]].tolist())\n",
+ " "
]
},
{
"cell_type": "code",
- "execution_count": 19,
- "metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "C:\\Users\\coral\\anaconda3\\lib\\site-packages\\pandas\\util\\_decorators.py:311: SettingWithCopyWarning: \n",
- "A value is trying to be set on a copy of a slice from a DataFrame\n",
- "\n",
- "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
- " return func(*args, **kwargs)\n"
- ]
- },
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "
\n",
- " \n",
- " \n",
- " | \n",
- " Article | \n",
- " Class | \n",
- "
\n",
- " \n",
- " \n",
- " \n",
- " | 0 | \n",
- " Tanya Roberts | \n",
- " B-Class article | \n",
- "
\n",
- " \n",
- " | 1 | \n",
- " Bridgerton | \n",
- " C-Class article | \n",
- "
\n",
- " \n",
- " | 2 | \n",
- " Twenty-fifth Amendment to the United States Constitution | \n",
- " Good article | \n",
- "
\n",
- " \n",
- " | 3 | \n",
- " Jon Ossoff | \n",
- " C-Class article | \n",
- "
\n",
- " \n",
- " | 4 | \n",
- " 2021 storming of the United States Capitol | \n",
- " C-Class article | \n",
- "
\n",
- " \n",
- " | ... | \n",
- " ... | \n",
- " ... | \n",
- "
\n",
- " \n",
- " | 655 | \n",
- " James Michael Tyler | \n",
- " Start-Class article | \n",
- "
\n",
- " \n",
- " | 656 | \n",
- " Dune (franchise) | \n",
- " B-Class article | \n",
- "
\n",
- " \n",
- " | 657 | \n",
- " Shaheen Afridi | \n",
- " Start-Class article | \n",
- "
\n",
- " \n",
- " | 658 | \n",
- " Puneeth Rajkumar filmography | \n",
- " List-Class article | \n",
- "
\n",
- " \n",
- " | 659 | \n",
- " Shiva Rajkumar | \n",
- " Start-Class article | \n",
- "
\n",
- " \n",
- "
\n",
- "
660 rows × 2 columns
\n",
- "
"
- ],
- "text/plain": [
- " Article \\\n",
- "0 Tanya Roberts \n",
- "1 Bridgerton \n",
- "2 Twenty-fifth Amendment to the United States Constitution \n",
- "3 Jon Ossoff \n",
- "4 2021 storming of the United States Capitol \n",
- ".. ... \n",
- "655 James Michael Tyler \n",
- "656 Dune (franchise) \n",
- "657 Shaheen Afridi \n",
- "658 Puneeth Rajkumar filmography \n",
- "659 Shiva Rajkumar \n",
- "\n",
- " Class \n",
- "0 B-Class article \n",
- "1 C-Class article \n",
- "2 Good article \n",
- "3 C-Class article \n",
- "4 C-Class article \n",
- ".. ... \n",
- "655 Start-Class article \n",
- "656 B-Class article \n",
- "657 Start-Class article \n",
- "658 List-Class article \n",
- "659 Start-Class article \n",
- "\n",
- "[660 rows x 2 columns]"
- ]
- },
- "execution_count": 19,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "df_classe=data[[\"Article\", \"Class\"]]\n",
- "df_classe.drop_duplicates(subset='Article', keep='first', inplace=True, ignore_index=True)\n",
- "df_classe"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 20,
+ "execution_count": null,
"metadata": {
"scrolled": true
},
- "outputs": [
- {
- "data": {
- "text/html": [
- "[HTML rendering of the table below: 660 rows × 5 columns (Article, Nombre d'apparitions, Moyenne du nombre de vus, Notes/about, Class)]"
- ],
- "text/plain": [
- " Article Nombre d'apparitions \\\n",
- "0 1989 Tiananmen Square protests 1 \n",
- "1 2012 Summer Olympics medal table 1 \n",
- "2 2016 West Bengal Legislative Assembly election 1 \n",
- "3 2019 Canadian federal election 1 \n",
- "4 2020 Summer Olympics 5 \n",
- ".. ... ... \n",
- "655 ZZ Top 1 \n",
- "656 Zack Snyder 2 \n",
- "657 Zack Snyder's Justice League 4 \n",
- "658 Zitkala-Sa 1 \n",
- "659 Zodiac Killer 1 \n",
- "\n",
- " Moyenne du nombre de vus \\\n",
- "0 475850 \n",
- "1 541757 \n",
- "2 564338 \n",
- "3 679216 \n",
- "4 2123836.6 \n",
- ".. ... \n",
- "655 575452 \n",
- "656 651564.0 \n",
- "657 1632921.75 \n",
- "658 851153 \n",
- "659 1088838 \n",
- "\n",
- " Notes/about \\\n",
- "0 [32 years since the Chinese government responded to students protesting by sending in rifles and tanks (in the latter case, inspiring an iconic image, which seemingly vanished from Bing last week). And they have since done everything in their power to deny the violent events that ironically happened in a \"Heavenly Peace Square\", down to cracking down a vigil that happened in Hong Kong last week. Or somehow comparing the events to the invasion of the U.S. Capitol in January, when that one didn't end with hundreds of unarmed civilians being shot dead.] \n",
- "1 [Because people need something to compare #8 to. Though why 2012 got more views than 2016? Well, it probably has to do with how before Tokyo (#1), London was India's (#4) best performance.] \n",
- "2 [With #8, we wanted to remind themselves of what happened last time out.] \n",
- "3 [Two years ago, the Liberals didn't get a majority in the parliament. They tried again this year (#5), and still couldn't.] \n",
- "4 [The opening ceremonies are scheduled for July 23. The games were planned to showcase how Japan bounced back from the 2011 tsunami, but instead became overshadowed by the COVID-19 pandemic. The pandemic has so far pushed back the games by a year, caused a regional lockdown weeks before the opening, and prompted the creation of a condom-free-but-probably-not-sex-free Olympic village. Doesn't help that in spite of being one body of water away from the pandemic's origin, Japan took very long to start vaccinating its population and now the populace is afraid of a COVID resurgence., Sports fans now will spend two weeks in the very unfavorable Japan Standard Time to see the multi-sport event that started in spite of the goddamned pandemic delaying it for a year (and the protests of a populace that started immunizing itself too late). One of #1's teammates, Jordan Nwora, will compete for the Nigerian basketball team - Giannis' Greece certainly missed him as they couldn't qualify., Sports fans of the New World (and maybe a few of the Old one as well) are currently sleep-deprived to fit into the Tokyo Standard Time where the biggest multi-sport event is happening. The pandemic that delayed the Olympics is still having its effects felt, with medalists forced to wear masks on the podium and such., One more week where sports fans supported their countries from a distance, even in host city Tokyo as the same pandemic that delayed it for a year forced events without outsiders or reduced crowds. The Games closed on the Sunday this Report was published, to the relief of those who are losing their sleep to watch events late at night. At least the next ones are only three years away and the winter one is 6 months from now!, An iconic talk show host whose career spanned over six decades, King passed away on Saturday last week, earning him a higher spot this week.] \n",
- ".. ... \n",
- "655 [A bearded Texan rock trio - including a guy actually named Beard, who after decades with only a mustache grew a goatee - who ever since 1969 have been responsible for many classic tunes such as \"La Grange\" and \"Sharp Dressed Man\" (and for Back to the Future fans, \"Doubleback\"). One of their frontmen died (#9) 5 days after the band played without him for the first time in 51 years - the other leader, Billy Gibbons, said Hill specifically asked him to put his guitar tech in his place.] \n",
- "656 [A director with a certainly distinctive style, translated into many comic book adaptations - most recently our #1 (and indirectly, #5). This year also has Snyder's heist-during-the-zombie-apocalypse flick Army of the Dead., Snyder is one of the clearer beneficiaries of vulgar auteurism; how else do you go from making the reviled-by-nerds 300 and Watchmen to getting those same nerds to push for millions of dollars in reshoots for a superhero movie?] \n",
- "657 [The Justice League that was released in 2017 had Snyder credited as a director, but much of the film was altered in post-production and reshoots by Joss Whedon. Some fans were convinced that the \"Snyder cut,\" which surely existed somewhere, was far better than the lackluster Whedon version. However, due to the way effects-heavy superhero films are produced, the cut was probably far from complete (Folding Ideas has a good video about why.) After $70 million of effects and reshoots, the Snyder cut is finally complete - with a trailer dropping this last Sunday., All the cries of #ReleaseTheSnyderCut were filled, at the asking price of an HBO Max subscription and 4 hours to kill. Reviews were positive, even if the indulgent nature of this superhero project (again: the thing lasts 4 hours!) is a contentious point., On one hand, the film is more consistent and the added content fleshes out characters such as Cyborg and Steppenwolf. On the other hand, it's also consistently too grim (the washed colors and moody soundtrack help) and the amount of slow motion is abusive. Still, a valid if overlong effort. Now let's see what lies ahead, even if fans aren't ready to move on (#21)., Fans are now griping #RestoreTheSnyderverse given the extended cut of Justice League was released, but Warner Bros. won't continue it. Well, they should focus on how the studio makes worse decisions regarding the DC Extended Universe than that - just because Darkseid appears in a half a dozen scenes of this 4 hour movie, Warner cancelled a promising New Gods movie that would feature the famed DC villain.] \n",
- "658 [This early-20th-century Yankton Dakota activist and author was born on February 22, 1876. For her birthday, she was commemorated by a Google Doodle.] \n",
- "659 [The Case Breakers, an independent team of 40 cold case investigators, claimed they identified this still mysterious murderer who terrorized the San Francisco Bay Area in the late 1960s. The police disagrees with their discovery, deeming it too reliant on circumstantial evidence.] \n",
- "\n",
- " Class \n",
- "0 B-Class article \n",
- "1 Featured article \n",
- "2 C-Class article \n",
- "3 B-Class article \n",
- "4 C-Class article \n",
- ".. ... \n",
- "655 B-Class article \n",
- "656 C-Class article \n",
- "657 C-Class article \n",
- "658 B-Class article \n",
- "659 B-Class article \n",
- "\n",
- "[660 rows x 5 columns]"
- ]
- },
- "execution_count": 20,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
+ "outputs": [],
"source": [
- "result = pd.merge(new_data, df_classe, on=['Article'])\n",
- "result"
+ "data[\"Views\"][data.Article==new_data.Article[4]].tolist()"
]
},
{
"cell_type": "code",
- "execution_count": 21,
+ "execution_count": null,
"metadata": {},
"outputs": [],
- "source": [
- "result.to_json(r'C:\\Users\\coral\\Documents\\E5\\ProjetFullStack\\df.json')"
- ]
+ "source": []
}
],
"metadata": {
diff --git a/README.md b/README.md
index e386d4f..8a2c8d4 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,18 @@ Décorréler la partie backend du frontend (ie: créer deux sous projets avec de
 Backend: FastAPI, Flask, Django, etc. (preferably use the technologies from the course)
 Frontend: Vue, Angular, React, Flask, etc.
 Create at least a user section and a form section
-Bonus for using Kong + Keycloak (optional)
---------------------------------------
+WikiNav lists the most-read Wikipedia pages worldwide between January 3 and October 30.
+As a user, you have access to this list of articles and to a search bar.
+When you open an article, you can see how many weeks it was trending, its average number of views, a summary of why it drew so much interest, and its Wikipedia class.
+You can also add an article that interests you to your favorites.
+
+
+The Scrapper is in the main folder; the testdjangodock folder contains the site itself, including the backend, and the Template folder contains the frontend.
+
+To access the API, run "docker-compose up -d" in testdjangodock, then open 127.0.0.1:8000.
+
+Backend framework: Django
+Database: SQLite3
+Frontend framework: Bootstrap (CSS)
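
Note on the per-article fields described in the README above (weeks trending, average views, notes, class): the sketch below shows one way such a summary could be computed, and is not the notebook's own code. It uses only the standard library. The function names `parse_views` and `summarize_articles`, the dict-based row layout, and the sample values are all assumptions made for illustration; the only things taken from the notebook are the column names and the comma-separated view strings.

```python
from collections import defaultdict
from statistics import mean


def parse_views(raw):
    """Turn a view count such as '1,987,464' into an int (assumed input format)."""
    return int(str(raw).replace(",", ""))


def summarize_articles(rows):
    """Group weekly rows by article and compute the per-article fields.

    `rows` is assumed to be an iterable of dicts with the keys
    'Article', 'Views', 'Notes/about' and 'Class' (hypothetical layout).
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["Article"]].append(row)

    summary = {}
    for article, weeks in grouped.items():
        summary[article] = {
            # One row per weekly report the article appeared in.
            "weeks_trending": len(weeks),
            "average_views": mean(parse_views(w["Views"]) for w in weeks),
            "notes": [w["Notes/about"] for w in weeks],
            # Keep the class from the first appearance only.
            "class": weeks[0]["Class"],
        }
    return summary


# Illustrative rows only; the view counts and notes here are made up.
sample_rows = [
    {"Article": "Zack Snyder", "Views": "700,000", "Notes/about": "week A", "Class": "C-Class article"},
    {"Article": "Zack Snyder", "Views": "600,000", "Notes/about": "week B", "Class": "C-Class article"},
    {"Article": "ZZ Top", "Views": "575,452", "Notes/about": "week C", "Class": "B-Class article"},
]
result = summarize_articles(sample_rows)
```

Parsing the view strings before averaging matters here because the raw column holds text with thousands separators, so a numeric mean is only possible after conversion.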
diff --git a/Scrapper.ipynb b/Scrapper.ipynb
index db7f202..9df3b48 100644
--- a/Scrapper.ipynb
+++ b/Scrapper.ipynb
@@ -44,7 +44,7 @@
"\n",
"\n",
"Wikipedia:Top 25 Report/October 17 to 23, 2021 - Wikipedia\n",
- "\n",
"\n",
+ "\n",
"