TensorFlow Keras Natural Language Sentiment Classification Model

Published on · 2023-04-20 · # Default category # AI models # Anaconda # RNN # LSTM # Bi-LSTM # TensorFlow # Binary classification model

⭐ TensorFlow Keras Natural Language Sentiment Classification Model ⭐

🤔 What is this

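A binary sentiment classification model for natural language text, trained with TensorFlow Keras
The dataset has two labels, pos and neg, with 2000 txt text samples each
RNN, LSTM, and Bi-LSTM are used as the model architectures, built and trained with Keras
Main package versions: TensorFlow 2.5.2, Keras 2.3.1, and Python 3.6.2
Consistently reaches 91% accuracy on the test dataset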
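As a rough illustration of how a dataset with this layout could be read, the sketch below walks the two label folders and returns the texts with 0/1 labels. The function name `load_labeled_texts`, the UTF-8 encoding, and the mapping of pos to 1 and neg to 0 are assumptions made for this example and are not taken from src/run.py.

```python
from pathlib import Path


def load_labeled_texts(data_dir, encoding="utf-8"):
    """Read every .txt file under data_dir/pos and data_dir/neg.

    Returns (texts, labels), where label 1 means pos and 0 means neg.
    The label mapping and the encoding are assumptions for this sketch.
    """
    texts, labels = [], []
    for label_name, label_value in (("pos", 1), ("neg", 0)):
        folder = Path(data_dir) / label_name
        # Sort the file names so the result is reproducible across runs.
        for path in sorted(folder.glob("*.txt")):
            # errors="ignore" drops bytes that cannot be decoded.
            text = path.read_text(encoding=encoding, errors="ignore")
            texts.append(text.strip())
            labels.append(label_value)
    return texts, labels
```

With the repository layout described in the file outline, a call such as `load_labeled_texts("res/datanew")` would be expected to return 4000 texts, half of them labeled 1.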

⚙ Deployment

Clone this project repository to your local machine

git clone https://gitlab.soraharu.com/XiaoXi/TensorFlow-Keras-Natural-Language-Emotion-Classification-Model.git

Use Anaconda to create a virtual environment for the project

conda create --name nlec python=3.6.2

Activate the virtual environment

conda activate nlec

Upgrade pip

pip3 install --user --upgrade pip

Install the Python library dependencies

pip3 install -r requirements.txt

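Download the pre-packaged Chinese word vectors (this project uses Zhihu_QA Word + Ngram) and place them in the res/word-vector directory
Edit generic_utils.py and append the following code to the end of the file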

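# Copy into target_dict every attribute of the given modules that passes obj_filter.
def populate_dict_with_module_objects(target_dict, modules, obj_filter):
    for module in modules:
        for name in dir(module):
            obj = getattr(module, name)
            if obj_filter(obj):
                target_dict[name] = obj


# Convert a CamelCase name to snake_case, e.g. "MaxPooling" -> "max_pooling".
def to_snake_case(s):
    return ''.join(['_' + ch.lower() if ch.isupper() else ch for ch in str(s)]).lstrip('_')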

The file should be located at envs/nlec/Lib/site-packages/keras/utils/generic_utils.py under the Anaconda installation directory
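To make the purpose of the two helpers clearer, here is a small standalone sketch of how they behave. It repeats the two functions so that it runs on its own; the `math` module, the `callable` filter, and the sample name `GlobalMaxPooling` are arbitrary choices for illustration and do not reflect how Keras itself calls these helpers.

```python
import math


def populate_dict_with_module_objects(target_dict, modules, obj_filter):
    for module in modules:
        for name in dir(module):
            obj = getattr(module, name)
            if obj_filter(obj):
                target_dict[name] = obj


def to_snake_case(s):
    return ''.join(['_' + ch.lower() if ch.isupper() else ch for ch in str(s)]).lstrip('_')


# Collect every callable exposed by the math module into a plain dict.
registry = {}
populate_dict_with_module_objects(registry, [math], obj_filter=callable)
print("sqrt" in registry)  # True: math.sqrt is callable
print("pi" in registry)    # False: math.pi is a float, so the filter rejects it

# Each uppercase letter becomes "_" plus its lowercase form.
print(to_snake_case("GlobalMaxPooling"))  # global_max_pooling
```

Note that the conversion is purely per character, so a run of capitals such as "LSTM" would come out as "l_s_t_m".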

Install Graphviz and add it to the system PATH
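One way to confirm that Graphviz is reachable before running the project is to look for its `dot` executable on PATH. This is a minimal sketch using only the standard library; the function name `graphviz_on_path` is hypothetical, and checking for `dot` specifically is an assumption about which Graphviz program the project relies on.

```python
import shutil


def graphviz_on_path(executable="dot"):
    """Return the full path of the Graphviz executable, or None if not found."""
    return shutil.which(executable)


location = graphviz_on_path()
if location is None:
    print("Graphviz 'dot' was not found on PATH")
else:
    print("Graphviz found at", location)
```

If the check reports that `dot` is missing, reopening the terminal after changing PATH is often enough for the new value to take effect.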

🏃 Run

Execute src/run.py

python src/run.py

Adjust the commonly used parameters

my_lr = 1e-2 # initial learning rate
my_test_size = 0.1 # test set proportion
my_validation_split = 0.1 # validation set proportion
my_epochs = 40 # number of training epochs
my_batch_size = 128 # batch size
my_dropout = 0.2 # dropout rate

my_optimizer = Nadam(lr=my_lr) # optimizer
my_loss = 'binary_crossentropy' # loss function
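To see what these proportions mean for the 4000 samples, the following sketch works out approximate split sizes. It assumes the test set is held out first using my_test_size and that my_validation_split is then applied to the remaining training data; src/run.py may split the data differently. The function name `split_sizes` is hypothetical.

```python
import math


def split_sizes(n_total, test_size, validation_split):
    """Return approximate (train, validation, test) sample counts."""
    n_test = round(n_total * test_size)
    n_remaining = n_total - n_test
    n_val = round(n_remaining * validation_split)
    n_train = n_remaining - n_val
    return n_train, n_val, n_test


# 2000 pos + 2000 neg samples, using the default proportions of 0.1 and 0.1.
n_train, n_val, n_test = split_sizes(4000, test_size=0.1, validation_split=0.1)
print(n_train, n_val, n_test)  # 3240 360 400

# Number of gradient updates per epoch with a batch size of 128.
steps_per_epoch = math.ceil(n_train / 128)
print(steps_per_epoch)  # 26
```

Under these assumptions, 40 epochs correspond to roughly 1040 optimizer steps in total.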

📄 File outline

├───.gitignore
├───LICENSE
├───README.md
├───requirements.txt
├───res
│   ├───datanew
│   │   ├───neg (2000 negative txt)
│   │   └───pos (2000 positive txt)
│   └───word-vector
│       ├───README.txt
│       └───sgns.zhihu.bigram.bz2
├───src
│   └───run.py
└───tmp
    ├───model.png
    ├───README.txt
    └───weights.hdf5
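For readers who want to inspect the word vectors listed in the outline, the sketch below reads a bz2-compressed file in the word2vec text layout: a header line with the vocabulary size and the dimension, then one word followed by its values per line. That layout is an assumption about sgns.zhihu.bigram.bz2; the function name `read_word_vectors` and its `limit` parameter are hypothetical, and src/run.py may load the file in a different way.

```python
import bz2


def read_word_vectors(path, limit=None):
    """Read up to `limit` vectors from a bz2-compressed word2vec text file.

    Returns (dim, vectors), where vectors maps each word to a list of floats.
    """
    vectors = {}
    with bz2.open(path, mode="rt", encoding="utf-8", errors="ignore") as handle:
        # Header line: "<vocabulary size> <dimension>".
        _, dim = (int(value) for value in handle.readline().split())
        for line in handle:
            if limit is not None and len(vectors) >= limit:
                break
            parts = line.rstrip().split(" ")
            # The last `dim` fields are the vector; everything before is the word.
            word = " ".join(parts[:-dim])
            if not word:
                continue  # skip blank or malformed lines
            vectors[word] = [float(value) for value in parts[-dim:]]
    return dim, vectors
```

A call such as `read_word_vectors("res/word-vector/sgns.zhihu.bigram.bz2", limit=1000)` would read only the first 1000 entries, which keeps memory use small while exploring the file.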

📜 Open source license

Released as open source under the MIT License.
