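Chinese Medicine Image Recognition System - Backend

Description

This project is the backend of the Chinese medicine recognition app and exposes data-only APIs. For the mobile client, see Chinese Medicine Image Recognition System - Mobile App.

Version notes: 2.x

Version 2.x refactors and upgrades 1.x: the code structure was reorganized, and Deeplearning4j, a heavyweight and particularly unstable component, was removed.

Project overview

This project contains five modules:

Project preview

Documentation

Read the documentation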

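Technical details

  1. medicine-server: server-side project

    Built with Gradle

    Spring Boot framework, one-step startup and deployment

    Document database: MongoDB

    Full-text search: Elasticsearch + IK analyzer

    Database: MySQL

  2. medicine-collection: crawler project

    The crawler collects the training set and detailed information about each medicine, mainly: name, morphology, images, aliases, English name, compatible prescriptions, effects and functions, clinical applications, origin and distribution, medicinal parts, nature, flavor and channel tropism, pharmacological research, main constituents, contraindications, harvesting and processing, and characteristics of the medicinal material.

    Crawler framework: WebMagic (reference code)

    Data persistence: MongoDB

    Data structure (abridged)

    • Top-level category information for medicines

    • Detailed medicine information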
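
    As a rough illustration of the two record types listed above, the following Python sketch models them as dataclasses. The class names, field names and sample values are hypothetical, chosen to mirror a few of the fields the crawler is described as collecting; the actual MongoDB schema may differ.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class MedicineCategory:
    """Top-level category record (hypothetical field names)."""
    name: str
    description: str = ""


@dataclass
class MedicineDetail:
    """Detailed medicine record (hypothetical field names)."""
    name: str
    english_name: str = ""
    aliases: List[str] = field(default_factory=list)
    image_urls: List[str] = field(default_factory=list)
    morphology: str = ""
    effects: str = ""
    contraindications: str = ""
    category: str = ""


def to_document(record) -> dict:
    """Convert a record into a plain dict, the shape a MongoDB driver accepts."""
    return asdict(record)


doc = to_document(MedicineDetail(name="example-herb", aliases=["alias-1"],
                                 category="example-category"))
print(doc["name"], doc["aliases"])
```

    Keeping the records as plain dataclasses makes it easy to validate crawled fields before they are persisted.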

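  3. medicine-model: convolutional neural network project

    Language: Python

    Uses the TensorFlow deep learning framework; using Keras greatly reduces the amount of code

    Training machine: Huawei Atlas 200 AI developer board (or a local computer)

    Dataset

    Common convolutional network models and their accuracy on ImageNet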

    Model              Size    Top-1 accuracy  Top-5 accuracy  Parameters   Depth
    Xception           88 MB   0.790           0.945           22,910,480   126
    VGG16              528 MB  0.713           0.901           138,357,544  23
    VGG19              549 MB  0.713           0.900           143,667,240  26
    ResNet50           98 MB   0.749           0.921           25,636,712   168
    ResNet101          171 MB  0.764           0.928           44,707,176   -
    ResNet152          232 MB  0.766           0.931           60,419,944   -
    ResNet50V2         98 MB   0.760           0.930           25,613,800   -
    ResNet101V2        171 MB  0.772           0.938           44,675,560   -
    ResNet152V2        232 MB  0.780           0.942           60,380,648   -
    ResNeXt50          96 MB   0.777           0.938           25,097,128   -
    ResNeXt101         170 MB  0.787           0.943           44,315,560   -
    InceptionV3        92 MB   0.779           0.937           23,851,784   159
    InceptionResNetV2  215 MB  0.803           0.953           55,873,736   572
    MobileNet          16 MB   0.704           0.895           4,253,864    88
    MobileNetV2        14 MB   0.713           0.901           3,538,984    88
    DenseNet121        33 MB   0.750           0.923           8,062,504    121
    DenseNet169        57 MB   0.762           0.932           14,307,880   169
    DenseNet201        80 MB   0.773           0.936           20,242,984   201
    NASNetMobile       23 MB   0.744           0.919           5,326,716    -
    NASNetLarge        343 MB  0.825           0.960           88,949,818   -

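    Given the hardware constraints, and weighing accuracy, size and complexity, the Xception model was chosen. It is a convolutional network with a topological depth of 134 layers (counting activation layers, batch normalization layers, and so on).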
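
    One way to make this trade-off concrete is to filter the table by a size budget and pick the most accurate remaining model. The sketch below does that with the figures from the table; the 100 MB budget and the helper name are assumptions for illustration, not the author's stated selection criterion.

```python
# (name, size in MB, top-1 accuracy), copied from the table above
MODELS = [
    ("Xception", 88, 0.790), ("VGG16", 528, 0.713), ("VGG19", 549, 0.713),
    ("ResNet50", 98, 0.749), ("ResNet101", 171, 0.764), ("ResNet152", 232, 0.766),
    ("ResNet50V2", 98, 0.760), ("ResNet101V2", 171, 0.772), ("ResNet152V2", 232, 0.780),
    ("ResNeXt50", 96, 0.777), ("ResNeXt101", 170, 0.787), ("InceptionV3", 92, 0.779),
    ("InceptionResNetV2", 215, 0.803), ("MobileNet", 16, 0.704), ("MobileNetV2", 14, 0.713),
    ("DenseNet121", 33, 0.750), ("DenseNet169", 57, 0.762), ("DenseNet201", 80, 0.773),
    ("NASNetMobile", 23, 0.744), ("NASNetLarge", 343, 0.825),
]


def best_under_budget(models, max_size_mb):
    """Return the model with the highest top-1 accuracy within the size budget."""
    candidates = [m for m in models if m[1] <= max_size_mb]
    return max(candidates, key=lambda m: m[2]) if candidates else None


# Hypothetical budget: at most 100 MB of weights
choice = best_under_budget(MODELS, max_size_mb=100)
print(choice)  # ('Xception', 88, 0.79)
```

    With this budget, Xception has the highest Top-1 accuracy of the models that fit, which is consistent with the choice described above.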

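    Xception function signature:

    def Xception(include_top=True,
                 weights='imagenet',
                 input_tensor=None,
                 input_shape=None,
                 pooling=None,
                 classes=1000,
                 **kwargs)

    # Parameters
    # include_top: whether to keep the fully connected layer at the top of the network
    # weights: None means random initialization (no pretrained weights are loaded); 'imagenet' loads weights pretrained on ImageNet; a path to a weights file is also accepted
    # input_tensor: optional Keras tensor to use as the image input of the model
    # input_shape: optional; only specified when include_top=False. A tuple of length 3 giving the input image shape; width and height must be at least 71, e.g. (150, 150, 3)
    # pooling: pooling mode when include_top=False. None means no pooling, so the output of the last convolutional layer is a 4D tensor; 'avg' applies global average pooling; 'max' applies global max pooling
    # classes: optional number of classes; only used when include_top=True and no pretrained weights are loaded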
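
    To make the `pooling` option concrete, the pure-Python sketch below applies global average and global max pooling to a tiny feature map. It only illustrates the operation and is not Keras code; the 2x2x2 feature map and the helper name are made up for the example.

```python
def global_pool(feature_map, mode="avg"):
    """Pool a (height, width, channels) nested list down to one value per channel."""
    channels = len(feature_map[0][0])
    pooled = []
    for c in range(channels):
        # Collect every spatial position of channel c
        values = [pixel[c] for row in feature_map for pixel in row]
        pooled.append(sum(values) / len(values) if mode == "avg" else max(values))
    return pooled


# 2 x 2 spatial grid with 2 channels
fmap = [
    [[1.0, 10.0], [2.0, 20.0]],
    [[3.0, 30.0], [6.0, 40.0]],
]
print(global_pool(fmap, "avg"))  # [3.0, 25.0]
print(global_pool(fmap, "max"))  # [6.0, 40.0]
```

    In both modes the spatial dimensions collapse, so a batch of feature maps goes from a 4D tensor to a 2D tensor with one vector per image.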

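    Build code

    Fine-tuning a model based on Xception; see the code for details

    1. Set the Xception parameters

      Transfer-learning weights to load: xception_weights

      # Assumes the Keras API bundled with TensorFlow 2.x
      from tensorflow import keras

      # Input image height, width and number of channels
      img_size = (299, 299, 3)

      # Xception backbone without the top classifier, initialized from a local weights file
      base_model = keras.applications.xception.Xception(include_top=False,
                                                      weights='..\\resources\\keras-model\\xception_weights_tf_dim_ordering_tf_kernels_notop.h5',
                                                      input_shape=img_size,
                                                      pooling='avg')

      # Fully connected layer with softmax activation that outputs class probabilities; there are 628 classes
      predictions = keras.layers.Dense(628, activation='softmax', name='predictions')(base_model.output)
      model = keras.Model(base_model.input, predictions)

      # Freeze the convolutional layers
      for layer in base_model.layers:
          layer.trainable = False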
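
      The final Dense layer turns its 628 raw scores into probabilities with softmax. As a minimal sketch of what that activation computes (plain Python rather than Keras, with a three-class input chosen only for illustration):

```python
import math


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(sum(probs))
```

      The outputs are positive, sum to 1 up to floating-point rounding, and preserve the ordering of the input scores, which is why the largest value can be read as the predicted class.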
    2. Train the fully connected layer (v1.0)

      from tensorflow import keras
      from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

      from base_model import model

      # Training image size and directory settings
      img_size = (299, 299)
      dataset_dir = '..\\dataset\\dataset'
      img_save_to_dir = 'resources\\image-traing\\'
      log_dir = 'resources\\train-log'

      model_dir = 'resources\\keras-model\\'

      # Data augmentation. preprocess_input already scales pixels to [-1, 1],
      # so no additional rescale factor is applied on top of it.
      train_datagen = keras.preprocessing.image.ImageDataGenerator(
          shear_range=0.2,
          width_shift_range=0.4,
          height_shift_range=0.4,
          rotation_range=90,
          zoom_range=0.7,
          horizontal_flip=True,
          vertical_flip=True,
          preprocessing_function=keras.applications.xception.preprocess_input)

      # Validation images are only preprocessed, not augmented
      test_datagen = keras.preprocessing.image.ImageDataGenerator(
          preprocessing_function=keras.applications.xception.preprocess_input)

      train_generator = train_datagen.flow_from_directory(
          dataset_dir,
          save_to_dir=img_save_to_dir,
          target_size=img_size,
          class_mode='categorical')

      # Note: validation reads from the same directory as training
      validation_generator = test_datagen.flow_from_directory(
          dataset_dir,
          save_to_dir=img_save_to_dir,
          target_size=img_size,
          class_mode='categorical')

      # Early stopping, learning-rate reduction on plateau, and TensorBoard logging
      early_stop = EarlyStopping(monitor='val_loss', patience=13)
      reduce_lr = ReduceLROnPlateau(monitor='val_loss', patience=7, mode='auto', factor=0.2)
      tensorboard = keras.callbacks.TensorBoard(log_dir=log_dir)

      # Freeze everything except the final 'predictions' layer, which is trained here
      for layer in model.layers[:-1]:
          layer.trainable = False

      # Compile the model
      model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

      # fit_generator is deprecated in newer TensorFlow releases, where Model.fit accepts generators
      history = model.fit_generator(train_generator,
                                    steps_per_epoch=train_generator.samples // train_generator.batch_size,
                                    epochs=100,
                                    validation_data=validation_generator,
                                    validation_steps=validation_generator.samples // validation_generator.batch_size,
                                    callbacks=[early_stop, reduce_lr, tensorboard])
      # Export the model
      model.save(model_dir + 'chinese_medicine_model_v1.0.h5')
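
      The two patience values used above interact: the learning rate is cut after 7 epochs without improvement, and training stops after 13. The sketch below replays a made-up `val_loss` history to show that interplay. It is a simplified model of the callbacks, not Keras code: an epoch counts as an improvement only if the loss is strictly below the best seen so far, and the 0.001 starting learning rate is an assumption (the default for Keras' RMSprop).

```python
def simulate_callbacks(val_losses, lr=0.001, stop_patience=13, lr_patience=7, factor=0.2):
    """Replay a val_loss history and return (epochs_run, final_lr)."""
    best = float("inf")
    stop_wait = 0  # epochs without improvement, for early stopping
    lr_wait = 0    # epochs without improvement since the last LR reduction
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            stop_wait = 0
            lr_wait = 0
            continue
        stop_wait += 1
        lr_wait += 1
        if lr_wait >= lr_patience:
            lr *= factor  # reduce the learning rate on a plateau
            lr_wait = 0
        if stop_wait >= stop_patience:
            return epoch, lr  # early stop
    return len(val_losses), lr


# Hypothetical history: two improving epochs, then a plateau
history = [1.0, 0.9] + [0.95] * 20
epochs_run, final_lr = simulate_callbacks(history)
print(epochs_run, final_lr)
```

      With this history the learning rate is reduced once during the plateau and training stops at epoch 15, well before the 100-epoch limit.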
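    3. Fine-tune the weights of the top 6 convolutional layers on the dataset

      # Load the model saved in step 2 (train_generator, validation_generator,
      # the callbacks and model_dir are reused from that step)
      model = keras.models.load_model('resources\\keras-model\\chinese_medicine_model_v1.0.h5')

      # Freeze every layer, then unfreeze the six layers at indices 126-131
      for layer in model.layers:
          layer.trainable = False
      for layer in model.layers[126:132]:
          layer.trainable = True

      # Recompile so that the changed trainable flags take effect
      model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

      history = model.fit_generator(train_generator,
                                    steps_per_epoch=train_generator.samples // train_generator.batch_size,
                                    epochs=100,
                                    validation_data=validation_generator,
                                    validation_steps=validation_generator.samples // validation_generator.batch_size,
                                    callbacks=[early_stop, reduce_lr, tensorboard])
      model.save(model_dir + 'chinese_medicine_model_v2.0.h5')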
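
      To check which layers the slice in step 3 actually unfreezes, the sketch below repeats the two loops on stand-in objects instead of real Keras layers. The `Layer` class is hypothetical, and the count of 134 layers is taken from the depth quoted for the model in this README.

```python
class Layer:
    """Stand-in for a Keras layer: only the trainable flag matters here."""

    def __init__(self, index):
        self.index = index
        self.trainable = True


# 134 layers, matching the depth quoted for the model
layers = [Layer(i) for i in range(134)]

# Same two loops as in step 3: freeze everything, then unfreeze a slice
for layer in layers:
    layer.trainable = False
for layer in layers[126:132]:
    layer.trainable = True

trainable = [layer.index for layer in layers if layer.trainable]
print(trainable)  # [126, 127, 128, 129, 130, 131]
```

      Because the end of a Python slice is exclusive, exactly six layers are unfrozen and the last two (indices 132 and 133) stay frozen.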
      
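    4. In the backend project, the trained model is invoked through Deeplearning4j

      import java.io.File;
      import java.io.IOException;
      import java.nio.file.Files;
      import java.nio.file.Paths;
      import java.util.Comparator;
      import java.util.HashMap;
      import java.util.LinkedHashMap;
      import java.util.LinkedList;
      import java.util.List;
      import java.util.Map;

      import org.datavec.image.loader.NativeImageLoader;
      import org.deeplearning4j.nn.graph.ComputationGraph;
      import org.deeplearning4j.nn.modelimport.keras.KerasModelImport;
      import org.deeplearning4j.nn.modelimport.keras.exceptions.InvalidKerasConfigurationException;
      import org.deeplearning4j.nn.modelimport.keras.exceptions.UnsupportedKerasConfigurationException;
      import org.nd4j.linalg.api.ndarray.INDArray;

      public class CnnModelUtil {

          private static ComputationGraph CNN_MODEL = null;

          /**
           * Mapping from class index to medicine name
           */
          private static final Map<Integer, String> MEDICINE_NAME_MAP = new HashMap<>();

          /**
           * Directory that holds the CNN model files
           */
          private static final String DATA_DIR = System.getProperty("os.name")
                  .toLowerCase().contains("windows") ? "D:\\data\\model\\"
                  : "./data/model/";

          /**
           * File name of the medicine label table (one "name,index" pair per line)
           */
          private static final String MEDICINE_LABLE_FILE_NAME = "medicine_name-lable.txt";

          /**
           * File name of the model
           */
          private static final String CNN_MODEL_FILE_NAME = "chinese_medicine_model.h5";

          /**
           * Image loader that resizes inputs to 299 x 299 with 3 channels
           */
          private static final NativeImageLoader IMAGE_LOADER = new NativeImageLoader(299, 299, 3);

          /**
           * Initialization: import the Keras model and load the label table
           */
          static {
              try {
                  CNN_MODEL = KerasModelImport.importKerasModelAndWeights(DATA_DIR + CNN_MODEL_FILE_NAME);

                  Files.readAllLines(Paths.get(DATA_DIR, MEDICINE_LABLE_FILE_NAME)).forEach(v -> {
                      String[] split = v.split(",");
                      MEDICINE_NAME_MAP.put(Integer.valueOf(split[1]), split[0]);
                  });
              } catch (IOException | InvalidKerasConfigurationException | UnsupportedKerasConfigurationException e) {
                  e.printStackTrace();
              }
          }

          /**
           * Runs prediction on an image, sorts the predicted probabilities in
           * descending order, and returns the 10 most probable medicines.
           *
           * @param file the image file to classify
           * @return medicine names mapped to their probabilities, highest first
           * @throws IOException if the image cannot be read
           */
          public static Map<String, Float> medicineNamePredict(File file) throws IOException {
              // Scale pixel values from [0, 255] to [-1, 1], as in training
              INDArray image = IMAGE_LOADER.asMatrix(file).divi(127.5).subi(1);
              INDArray output = CNN_MODEL.outputSingle(image);
              Map<Integer, Float> resultMap = new HashMap<>();
              float[] floats = output.toFloatVector();
              for (int i = 0; i < floats.length; i++) {
                  resultMap.put(i, floats[i]);
              }
              List<Map.Entry<Integer, Float>> resultList = new LinkedList<>(resultMap.entrySet());
              resultList.sort(Map.Entry.comparingByValue(Comparator.reverseOrder()));
              // LinkedHashMap keeps the descending-probability order
              Map<String, Float> medicinePredict = new LinkedHashMap<>();
              resultList.stream().limit(10).forEach(v -> {
                  medicinePredict.put(MEDICINE_NAME_MAP.get(v.getKey()), v.getValue());
              });
              return medicinePredict;
          }
      }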
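
    The pre- and post-processing around the model call above can be sketched in Python, the language of the training code. `scale_pixel` mirrors the `divi(127.5).subi(1)` step and `top_k` mirrors the sort-and-limit step; both helper names and the three-entry label map are hypothetical and used only for illustration.

```python
def scale_pixel(value):
    """Map a pixel value from [0, 255] to [-1, 1]."""
    return value / 127.5 - 1.0


def top_k(probabilities, labels, k=10):
    """Return the k most probable (label, probability) pairs, highest first."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return [(labels[index], prob) for index, prob in ranked[:k]]


# Hypothetical label map: class index -> medicine name
labels = {0: "herb-a", 1: "herb-b", 2: "herb-c"}

print(scale_pixel(0), scale_pixel(255))     # -1.0 1.0
print(top_k([0.1, 0.7, 0.2], labels, k=2))  # [('herb-b', 0.7), ('herb-c', 0.2)]
```

    Using the same scaling at inference time as during training matters, because the network only ever saw inputs in the [-1, 1] range.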

    Model overview

    Detailed model structure

    Visualization of accuracy and loss during training

    Accuracy    Loss

  4. medicine-dataset: dataset

  5. medicine-util: shared utility classes

Dependencies

Dependency            Version
JDK                   8+
Python                3.6
Maven                 3.0+
TensorFlow            2.0
MongoDB               4.2.2
mongo-java-driver     3.12
MySQL                 8.0+
Spring Boot           2.2.2
Elasticsearch         7.4.2
IK analyzer           7.4.2
deeplearning4j        1.0.0-beta6
nd4j-native-platform  1.0.0-beta6

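Open-source usage notice

  • Use for personal study is permitted;
  • The open-source edition is not suitable for commercial use;
  • Selling the code or resources of this project in any form is prohibited; the infringer bears full responsibility for any consequences;
  • LICENSE

Discussion, feedback and contributing

  • To follow the latest project updates, Watch and Star the project; this is also the best way to support it

  • Technical discussion, questions about secondary development, issues and suggestions are all welcome!

  • QQ: 993021993

  • WeChat:

    WECHAT

If you like this project, please star it; it gives me more confidence to keep maintaining it. Thank you! ^_^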

Mulan Permissive Software License, Version 1 (Mulan PSL v1)

August 2019 http://license.coscl.org.cn/MulanPSL

Your reproduction, use, modification and distribution of the Software shall be subject to Mulan PSL v1 (this License) with following terms and conditions:

0. Definition

Software means the program and related documents which are comprised of those Contribution and licensed under this License.

Contributor means the Individual or Legal Entity who licenses its copyrightable work under this License.

Legal Entity means the entity making a Contribution and all its Affiliates.

Affiliates means entities that control, or are controlled by, or are under common control with a party to this License, ‘control’ means direct or indirect ownership of at least fifty percent (50%) of the voting power, capital or other securities of controlled or commonly controlled entity.

Contribution means the copyrightable work licensed by a particular Contributor under this License.

1. Grant of Copyright License

Subject to the terms and conditions of this License, each Contributor hereby grants to you a perpetual, worldwide, royalty-free, non-exclusive, irrevocable copyright license to reproduce, use, modify, or distribute its Contribution, with modification or not.

2. Grant of Patent License

Subject to the terms and conditions of this License, each Contributor hereby grants to you a perpetual, worldwide, royalty-free, non-exclusive, irrevocable (except for revocation under this Section) patent license to make, have made, use, offer for sale, sell, import or otherwise transfer its Contribution where such patent license is only limited to the patent claims owned or controlled by such Contributor now or in future which will be necessarily infringed by its Contribution alone, or by combination of the Contribution with the Software to which the Contribution was contributed, excluding of any patent claims solely be infringed by your or others’ modification or other combinations. If you or your Affiliates directly or indirectly (including through an agent, patent licensee or assignee), institute patent litigation (including a cross claim or counterclaim in a litigation) or other patent enforcement activities against any individual or entity by alleging that the Software or any Contribution in it infringes patents, then any patent license granted to you under this License for the Software shall terminate as of the date such litigation or activity is filed or taken.

3. No Trademark License

No trademark license is granted to use the trade names, trademarks, service marks, or product names of Contributor, except as required to fulfill notice requirements in section 4.

4. Distribution Restriction

You may distribute the Software in any medium with or without modification, whether in source or executable forms, provided that you provide recipients with a copy of this License and retain copyright, patent, trademark and disclaimer statements in the Software.

5. Disclaimer of Warranty and Limitation of Liability

The Software and Contribution in it are provided without warranties of any kind, either express or implied. In no event shall any Contributor or copyright holder be liable to you for any damages, including, but not limited to any direct, or indirect, special or consequential damages arising from your use or inability to use the Software or the Contribution in it, no matter how it’s caused or based on which legal theory, even if advised of the possibility of such damages.

End of the Terms and Conditions

How to apply the Mulan Permissive Software License, Version 1 (Mulan PSL v1) to your software

To apply the Mulan PSL v1 to your work, for easy identification by recipients, you are suggested to complete following three steps:

i. Fill in the blanks in following statement, including insert your software name, the year of the first publication of your software, and your name identified as the copyright owner;
ii. Create a file named “LICENSE” which contains the whole context of this License in the first directory of your software package;
iii. Attach the statement to the appropriate annotated syntax at the beginning of each source file.

Copyright (c) 2019-2020 李文浩
Chinese Medicine Identification is licensed under the Mulan PSL v1.
You can use this software according to the terms and conditions of the Mulan PSL v1.
You may obtain a copy of Mulan PSL v1 at: http://license.coscl.org.cn/MulanPSL
THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY OR FIT FOR A PARTICULAR PURPOSE.
See the Mulan PSL v1 for more details.
