Group 10

Group 10 - Web Scraping using BeautifulSoup

Name	Matric Number
FARAH IRDINA BINTI AHMAD BAHARUDIN	A20EC0035
LOW JUNYI	A20EC0071
NURFARRAHIN BINTI CHE ALIAS	A20EC0121
YONG ZHI YAN	A20EC0172

Beautiful Soup is a Python library for web scraping. It parses HTML or XML documents into a navigable tree, from which data can be extracted according to the document's structure. With Beautiful Soup, you can navigate the document, search for specific tags, and extract the text or attributes of those tags. It is often combined with other libraries, such as requests, to fetch web pages programmatically and extract data from them. The website we will be using is https://www.studymalaysia.com/education/top-stories/list-of-universities-in-malaysia.
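As a rough illustration of the navigate, search, and extract steps described above, here is a minimal sketch. The HTML snippet, its link targets, and the variable names are made up for the example and are not taken from the StudyMalaysia page; the built-in "html.parser" backend is assumed so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# A tiny hand-written document standing in for a downloaded page.
# The structure and the href values are illustrative assumptions.
html = """
<html><body>
  <h1>Universities</h1>
  <ul>
    <li><a href="/um">Universiti Malaya</a></li>
    <li><a href="/utm">Universiti Teknologi Malaysia</a></li>
  </ul>
</body></html>
"""

# Parse the markup into a tree using Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

# Search for a specific tag and extract its text.
heading = soup.find("h1").get_text(strip=True)

# Search for all <a> tags and extract both their text and an attribute.
links = [(a.get_text(strip=True), a["href"]) for a in soup.find_all("a")]

print(heading)
print(links)
```

For a live page, the `html` string would typically come from something like `requests.get(url).text` instead of a literal.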

This website is a resource for individuals interested in higher education in Malaysia. It provides a comprehensive list of universities in Malaysia, covering both public and private institutions, along with their locations, programs offered, and contact details. It also publishes articles and news about education and universities in Malaysia, as well as resources for students and parents. The website appears to be operated by StudyMalaysia Group, a provider of education and career guidance in Malaysia.

We plan to obtain data from the website by extracting one of its tables, specifically the list of 20 public universities in Malaysia. By inspecting the page's HTML source, we will locate the table and select it by its 'full boxed' class. We will then use the BeautifulSoup package and the pandas library to extract the table's contents from the HTML. Finally, we will save the extracted data as a CSV file. In summary, these tools let us retrieve the table from the website and store it in a reusable format.
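The plan above could be sketched roughly as follows. This is not the notebook's actual code: the helper name `table_to_dataframe`, the sample HTML, and its column names are hypothetical placeholders, and the real table on the page may be laid out differently. The sketch works on an HTML string so that it runs without network access.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Placeholder markup imitating a table with class="full boxed".
# The columns and rows are made up for illustration only.
SAMPLE_HTML = """
<table class="full boxed">
  <tr><th>No.</th><th>University</th></tr>
  <tr><td>1</td><td>Universiti Malaya</td></tr>
  <tr><td>2</td><td>Universiti Teknologi Malaysia</td></tr>
</table>
"""


def table_to_dataframe(html, css_class):
    """Find the first <table> whose class attribute equals css_class
    and return its rows as a DataFrame (first row used as the header)."""
    soup = BeautifulSoup(html, "html.parser")
    # A class_ string containing a space is matched against the exact
    # value of the class attribute, e.g. class="full boxed".
    table = soup.find("table", class_=css_class)
    if table is None:
        raise ValueError(f"no table with class {css_class!r} found")

    rows = []
    for tr in table.find_all("tr"):
        cells = [c.get_text(" ", strip=True) for c in tr.find_all(["th", "td"])]
        if cells:  # skip rows that contain no cells
            rows.append(cells)

    header, *body = rows
    return pd.DataFrame(body, columns=header)


df = table_to_dataframe(SAMPLE_HTML, "full boxed")

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Against the live site, the HTML would presumably be fetched first (for example with `requests.get(url).text`), and passing a file name such as `df.to_csv("20 Public Universities in Malaysia.csv", index=False)` would write the CSV to disk.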
