Python: Counting the 10 Most Frequent Words in a Text

飞熊 • March 5, 2023, 10:19 AM • Python • 287 views

Overview: this article shows how to use Python to count the 10 most frequent words in a text file. Site: www.bmabk.com. Source: original.

Contents of the text file:

我是中国人
word_list
我是中国人
word_list
我是中国人
word!list
我是中国人
word!list
wordlist
kobe
china
china
china
butler
hello
哈喽
world
world
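Several of these lines are not plain English words, so it helps to see how the pattern `[a-zA-Z]+` used in the code treats them. The following is a minimal sketch; the helper name `split_words` and the `samples` list are illustrative and not part of the original code. The pattern matches runs of ASCII letters only, so `_` and `!` act as separators and the Chinese lines contribute nothing.

```python
import re

# A few lines copied from the sample text above
samples = ["我是中国人", "word_list", "word!list", "wordlist", "哈喽"]


def split_words(line):
    """Return every run of ASCII letters found in the line."""
    return re.findall(r"[a-zA-Z]+", line)


for line in samples:
    print(repr(line), "->", split_words(line))
```

This is why `word` and `list` are each counted four times in the result, while `wordlist` is counted only once.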

Code:

import re


def fun_count():
    """Return every run of ASCII letters in test.txt, in file order."""
    word_list = []
    with open("test.txt", "r", encoding='utf-8') as f:
        for line in f:
            if line.strip() != '':
                # re.findall returns a list of all runs of ASCII letters on the line
                line_word = re.findall(r"[a-zA-Z]+", line)
                word_list += line_word  # collect the words from every line into one list

    return word_list


def main():
    word_dict = {}
    result = fun_count()
    # Count how many times each word occurs
    for item in result:
        if item not in word_dict:
            word_dict[item] = 1
        else:
            word_dict[item] += 1

    # Sort by count, highest first, and keep only the 10 most frequent words
    word_result = sorted(word_dict.items(), key=lambda x: x[1], reverse=True)[:10]

    # dicts preserve insertion order, so the result stays sorted by count
    return dict(word_result)


if __name__ == '__main__':
    res = main()
    print(res)

Output:

{'word': 4, 'list': 4, 'china': 3, 'world': 2, 'wordlist': 1, 'kobe': 1, 'butler': 1, 'hello': 1}
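The same count can also be written with `collections.Counter` from the standard library. This is offered as an alternative sketch, not the article's method. The function name `top_words`, its parameters, and the `sample` string are assumptions made for illustration, and the function takes a string rather than reading `test.txt` so that it runs on its own.

```python
import re
from collections import Counter


def top_words(text, n=10):
    """Return the n most frequent ASCII-letter words as (word, count) pairs."""
    words = re.findall(r"[a-zA-Z]+", text)
    # most_common sorts by count, highest first; ties keep first-seen order
    return Counter(words).most_common(n)


# Hypothetical sample, shortened from the text file above
sample = "word_list\nword!list\nchina\nchina\nhello\n"
print(top_words(sample, 3))
```

To use it with the file from the article, pass in the file's contents, for example `top_words(open("test.txt", encoding="utf-8").read())`.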

Article compiled by 极客之音. Link to this article: https://www.bmabk.com/index.php/post/123209.html