1. Scraping Target
2. Complete Source Code
#""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
#
# Copyright (c) 2024 愤怒的it男, All Rights Reserved.
# FileName : code.py
# Date : 2024.02.03
# Author : 愤怒的it男
# Version : 1.0.0
# Note : Follow the WeChat official account 【愤怒的it男】
#
#""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
import csv
import json
import requests
from tabulate import tabulate


def getData(url, headers):
    # Fetch the ranking JSON; the records sit under the top-level 'rows' key.
    response = requests.get(url=url, headers=headers, timeout=10)
    response.raise_for_status()
    rows = json.loads(response.text)['rows']
    result = []
    for row in rows:
        # Keep five fields per record, in the same order as the table columns.
        d = [row['hs_Rank_GTop500_Ranking'], row['hs_Rank_GTop500_Wealth'], row['hs_Rank_GTop500_ComName_Cn'], row['hs_Rank_GTop500_ChaName_Cn'], row['hs_Rank_GTop500_Industry_Cn']]
        result.append(d)
    return result


def printData(result):
    # Column titles: rank, company valuation (in units of 亿), company, CEO, industry.
    headers = ["排名", "企业估值(亿)", "企业信息", "CEO", "行业"]
    table = tabulate(result, headers=headers, tablefmt='grid')
    print(table)


def saveData(result):
    # newline='' lets the csv module control line endings; only data rows are written.
    with open('胡润世界500强.csv', 'w', encoding='utf-8', newline='') as file:
        writer = csv.writer(file)
        writer.writerows(result)


def main():
    # offset and limit are set so that one request asks for all 500 entries.
    baseUrl = 'https://www.hurun.net/zh-CN/Rank/HsRankDetailsList?num=XLM27SV9&offset=0&limit=500'
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
    result = getData(baseUrl, headers)
    printData(result)
    saveData(result)


if __name__ == "__main__":
    main()
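To follow the field extraction in getData without touching the network, the same key lookup can be run on a hand-written payload. This is only a sketch: the top-level 'rows' list and the five key names come from the source above, the two sample records reuse values shown in this article's results, and the value types, the FIELDS constant, and the helper name extract_rows are assumptions made for illustration.

```python
import json

# Keys read by getData in the source above, in output column order.
FIELDS = [
    "hs_Rank_GTop500_Ranking",
    "hs_Rank_GTop500_Wealth",
    "hs_Rank_GTop500_ComName_Cn",
    "hs_Rank_GTop500_ChaName_Cn",
    "hs_Rank_GTop500_Industry_Cn",
]

# Hand-written sample payload. The value types are an assumption and this is
# not a captured API response.
SAMPLE = json.dumps({
    "rows": [
        {
            "hs_Rank_GTop500_Ranking": 1,
            "hs_Rank_GTop500_Wealth": 192290,
            "hs_Rank_GTop500_ComName_Cn": "苹果",
            "hs_Rank_GTop500_ChaName_Cn": "蒂姆·库克",
            "hs_Rank_GTop500_Industry_Cn": "消费品",
        },
        {
            "hs_Rank_GTop500_Ranking": 2,
            "hs_Rank_GTop500_Wealth": 181940,
            "hs_Rank_GTop500_ComName_Cn": "微软",
            "hs_Rank_GTop500_ChaName_Cn": "萨提亚·纳德拉",
            "hs_Rank_GTop500_Industry_Cn": "软件与服务",
        },
    ]
}, ensure_ascii=False)


def extract_rows(text):
    """Hypothetical helper: parse JSON text and pick FIELDS from each row."""
    rows = json.loads(text)["rows"]
    return [[row[key] for key in FIELDS] for row in rows]


if __name__ == "__main__":
    for record in extract_rows(SAMPLE):
        print(record)
```

Keeping the key names in one list makes it easier to add or reorder columns, since the extraction loop itself does not change.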
3. Scraping Results
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 排名 | 企业估值(亿) | 企业信息 | CEO | 行业 |
+========+==================+==============================+=====================================+==================+
| 1 | 192290 | 苹果 | 蒂姆·库克 | 消费品 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 2 | 181940 | 微软 | 萨提亚·纳德拉 | 软件与服务 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 3 | 112440 | Alphabet | 桑达尔·皮查伊 | 传媒和娱乐 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 4 | 99550 | 亚马逊 | 安迪·贾西 | 零售 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 5 | 72910 | 英伟达 | 黄仁勋 | 半导体 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 6 | 59010 | 脸书 | 马克·扎克伯格 | 传媒和娱乐 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 7 | 54370 | 伯克希尔哈撒韦 | 沃伦·巴菲特 | 投资 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 8 | 46190 | 特斯拉 | 埃隆·马斯克 | 汽车及汽车零部件 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 9 | 38080 | 礼来 | David A. Ricks | 医疗健康 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 10 | 37070 | 台积电 | 魏哲家 | 半导体 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 11 | 35840 | 联合健康集团 | Dave Wichmann | 金融服务 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 12 | 34460 | 维萨 | 艾克礼 | 金融服务 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 13 | 31860 | 沃尔玛 | 董明伦 | 零售 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 14 | 31350 | 诺和诺德 | Lars Fruergaard Jørgensen | 医疗健康 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 15 | 30340 | 埃克森美孚 | Darren Woods | 能源 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 16 | 29100 | 摩根大通 | 杰米·戴蒙 | 金融服务 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 17 | 28450 | 腾讯控股 | 马化腾 | 传媒和娱乐 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 18 | 25850 | 强生 | 亚历克斯·戈尔斯基 | 医疗健康 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 19 | 25630 | 路威酩轩集团 | 贝尔纳·阿尔诺 | 消费品 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 19 | 25630 | 宝洁 | 大卫·泰勒 | 消费品 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 21 | 25560 | 万事达卡 | Ajaypal Singh Banga | 金融服务 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 22 | 25410 | 三星电子 | 权五铉 | 消费品 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 23 | 25120 | 博通 | 陈福阳 | 半导体 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 24 | 20630 | 家得宝 | Craig Menear | 零售 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 25 | 20490 | 甲骨文 | Safra Catz | 软件与服务 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 26 | 19840 | 雪佛龙 | Michael Wirth | 能源 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 27 | 19040 | 雀巢 | Ulf Mark Schneider | 食品饮料 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 28 | 18820 | 默克集团 | Kenneth C. Frazier | 医疗健康 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 29 | 18750 | 丰田汽车 | 丰田章男 | 汽车及汽车零部件 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 30 | 18030 | 艾伯维 | Richard A. Gonzalez | 医疗健康 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 31 | 17740 | 开市客 | W. Craig Jelinek | 零售 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 32 | 17670 | 可口可乐 | James Quincey | 食品饮料 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 33 | 17590 | 奥多比 | Shantanu Narayen | 软件与服务 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 34 | 17380 | 壳牌 | Ben van Beurden | 能源 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 35 | 17010 | 阿斯麦控股 | Peter Wennink | 半导体 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 36 | 16220 | 百事公司 | Ramon Laguarta | 食品饮料 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 37 | 16150 | 欧莱雅 | Jean-Paul Agon | 消费品 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 38 | 15710 | 阿里巴巴 | 吴泳铭 | 零售 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 39 | 15280 | 思科 | Charles H. Robbins | 电信 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 40 | 15130 | 美国银行 | Brian Moynihan | 金融服务 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 41 | 14770 | 罗氏公司 | Severin Schwan | 医疗健康 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 42 | 14620 | 必和必拓集团 | Mike Henry | 金属与矿产 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 43 | 14480 | 字节跳动 | 梁汝波 | 传媒和娱乐 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 44 | 14340 | 埃森哲 | Julie Sweet | 软件与服务 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 44 | 14340 | 信实工业 | Mukesh Ambani | 电信 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 46 | 14120 | 阿斯利康 | Pascal Soriot | 医疗健康 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 46 | 14120 | 赛富时 | Marc Benioff | 软件与服务 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 48 | 14050 | 爱马仕 | Axel Dumas | 消费品 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 49 | 13760 | 麦当劳 | Chris Kempczinski | 食品饮料 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 50 | 13030 | 奈飞 | Reed Hastings | 传媒和娱乐 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
| 50 | 13030 | 诺华 | Vasant Narasimhan | 医疗健康 |
+--------+------------------+------------------------------+-------------------------------------+------------------+
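The printed table carries column titles, but the CSV written by saveData contains data rows only. If the titles are also wanted in the file, one option is to write them as the first row. The sketch below assumes that approach and is not the author's code: the function name write_with_header is hypothetical, the titles are reused from printData, and an in-memory buffer stands in for the real file so that the example runs without creating anything on disk.

```python
import csv
import io

# Column titles reused from printData in the source above.
HEADERS = ["排名", "企业估值(亿)", "企业信息", "CEO", "行业"]


def write_with_header(file, result):
    """Hypothetical variant of saveData: write the titles, then the data rows."""
    writer = csv.writer(file)
    writer.writerow(HEADERS)
    writer.writerows(result)


if __name__ == "__main__":
    # Two rows copied from the results shown in this article.
    sample = [
        [1, 192290, "苹果", "蒂姆·库克", "消费品"],
        [2, 181940, "微软", "萨提亚·纳德拉", "软件与服务"],
    ]
    # newline="" mirrors the open() call in saveData so csv controls line endings.
    buffer = io.StringIO(newline="")
    write_with_header(buffer, sample)
    print(buffer.getvalue())
```

To write to disk instead, the same function can be passed a file opened as in saveData, that is, with encoding='utf-8' and newline=''.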
4. Disclaimer
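All source code in this article was written for personal study and was originally published on the WeChat official account (愤怒的it男). It may be used only for lawful purposes such as studying and researching computer technology. Using this source code for any activity that violates the laws and regulations of your country (or region) is prohibited. If I find any illegal or non-compliant use, I will report it to the internet supervision authorities.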
Originally published on the WeChat official account (愤怒的it男): Python爬虫|世界500强! (Python Crawler | The World's Top 500!)