Caffe: Training the LeNet-5 Model on Your Own Dataset

I. Preface

II. Preparation before training the model

(1) Preparing the image data

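The goal here is to train the LeNet model on my own images. My data has 10 classes, the digits 0 to 9, and each class is stored in a folder named after its digit. Create a folder char_images under /home/your_username/ to hold the image data, and inside /home/your_username/char_images/ create two folders named train and val. Each of them contains folders named 0 to 9; folder 0, for example, holds the images of the character "0".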
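The folder layout described above can also be created with a short script. This is only a sketch: the helper name make_layout and its root argument are illustrative and not part of the original tutorial, and the images still have to be copied into the class folders by hand.

```python
import os

CLASSES = [str(d) for d in range(10)]  # class folders "0" .. "9"


def make_layout(root):
    """Create root/{train,val}/{0..9}; make_layout is a hypothetical helper."""
    created = []
    for split in ("train", "val"):
        for cls in CLASSES:
            path = os.path.join(root, split, cls)
            os.makedirs(path, exist_ok=True)  # no error if the folder already exists
            created.append(path)
    return created
```

For the layout in this article one would call make_layout(os.path.expanduser("~/char_images")).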

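(2) Resizing all images to 28*28 and generating the txt label files

So that a mean file can be computed, all images must be scaled to the same size. In the directory that contains the train and val folders, create a Python file named getPath.py with the following content: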

#coding:utf-8

import os

import cv2


def IsSubString(SubStrList, Str):
    # Return True only if every element of SubStrList occurs in Str
    flag = True
    for substr in SubStrList:
        if not (substr in Str):
            flag = False

    return flag


def GetFileList(FindPath, FlagStr=()):
    # Collect the paths of the entries directly under FindPath
    FileList = []
    FileNames = os.listdir(FindPath)
    if len(FileNames) > 0:
        for fn in FileNames:
            if len(FlagStr) > 0:
                # Optional filter: keep only names containing every string in FlagStr.
                # (The author was unsure what this check is for; it is unused below
                # because FlagStr is left empty.)
                if IsSubString(FlagStr, fn):
                    fullfilename = os.path.join(FindPath, fn)
                    FileList.append(fullfilename)
            else:
                fullfilename = os.path.join(FindPath, fn)
                FileList.append(fullfilename)

    FileList.sort()

    return FileList


# Build the label list for the training images.
# NOTE: the output file name is missing in the source; fill it in before running.
train_txt = open('', 'w')
classList = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
for idx in range(len(classList)):
    # The dataset sits in the same directory as this .py file
    imgfile = GetFileList('train/' + classList[idx])
    for img in imgfile:
        srcImg = cv2.imread(img)
        resizedImg = cv2.resize(srcImg, (28, 28))
        cv2.imwrite(img, resizedImg)  # overwrite the original with the 28*28 version
        # Separate path and label with a space rather than \t
        strTemp = img + ' ' + classList[idx] + '\n'
        train_txt.writelines(strTemp)
train_txt.close()


# Build the label list for the test images.
# NOTE: the output file name is missing in the source; fill it in before running.
test_txt = open('', 'w')
for idx in range(len(classList)):
    imgfile = GetFileList('val/' + classList[idx])
    for img in imgfile:
        srcImg = cv2.imread(img)
        resizedImg = cv2.resize(srcImg, (28, 28))
        cv2.imwrite(img, resizedImg)
        # Separate path and label with a space rather than \t
        strTemp = img + ' ' + classList[idx] + '\n'
        test_txt.writelines(strTemp)
test_txt.close()

print("File lists generated successfully")

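Running this script resizes every image to 28*28 and writes the label txt files for the training and test images into the directory that contains the train and val folders. Each line of these files holds an image path relative to that directory, a space, and the class label.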
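Because every line of a list file is just a relative image path, a space, and a label, the files are easy to sanity-check before building the LMDB. The sketch below parses such lines and counts the images per class. parse_label_line and count_per_class are hypothetical helper names, and the sample paths are invented for illustration.

```python
from collections import Counter


def parse_label_line(line):
    """Split '<path> <label>' on the last space, so paths may contain spaces."""
    path, label = line.rstrip("\n").rsplit(" ", 1)
    return path, int(label)


def count_per_class(lines):
    """Return a Counter mapping label -> number of images, skipping blank lines."""
    return Counter(parse_label_line(line)[1] for line in lines if line.strip())


# Illustrative lines in the format written by getPath.py (file names are invented)
sample = ["train/0/a.png 0\n", "train/0/b.png 0\n", "train/7/c.png 7\n"]
counts = count_per_class(sample)
print(counts[0], counts[7])  # 2 1
```

A class whose count is zero or far smaller than the others usually points to a misplaced folder.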

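(3) Generating the LMDB datasets

First create a folder My_File under the Caffe root, and inside My_File create two folders, Build_lmdb and Data_label. Move the two text files generated in step (2) into Data_label.

Copy examples/imagenet/create_imagenet.sh from the Caffe root into the Build_lmdb folder.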
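The folder setup and the copy step can be scripted as well. A minimal sketch, assuming caffe_root is the path to your Caffe checkout; prepare_my_file is a hypothetical helper name:

```python
import os
import shutil


def prepare_my_file(caffe_root):
    """Create My_File/{Build_lmdb,Data_label} and copy create_imagenet.sh.

    prepare_my_file is a hypothetical helper; caffe_root is the Caffe checkout.
    """
    my_file = os.path.join(caffe_root, "My_File")
    build_dir = os.path.join(my_file, "Build_lmdb")
    label_dir = os.path.join(my_file, "Data_label")
    os.makedirs(build_dir, exist_ok=True)
    os.makedirs(label_dir, exist_ok=True)
    src = os.path.join(caffe_root, "examples", "imagenet", "create_imagenet.sh")
    copied = shutil.copy(src, build_dir)  # returns the path of the new copy
    return build_dir, label_dir, copied
```

The label txt files from step (2) still have to be moved into the returned label_dir.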

Open create_imagenet.sh and modify it as follows:

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
set -e

# Where the generated lmdb data is saved
EXAMPLE=/home/your_username/caffe/My_File/Build_lmdb
# Directory that holds the two txt label files
DATA=/home/your_username/caffe/My_File/Data_label
# Caffe's bundled tools; nothing to change except that an absolute path is safer
TOOLS=/home/your_username/caffe/build/tools

# Root of the training images; joined with the relative paths in the label
# file, it must give the full path of each image
TRAIN_DATA_ROOT=/home/your_username/char_images/
# Root of the test images, same convention
VAL_DATA_ROOT=/home/your_username/char_images/

# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
# The images were already resized to 28*28 in step (2), so this stays false.
RESIZE=false
if $RESIZE; then
  RESIZE_HEIGHT=28
  RESIZE_WIDTH=28
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi

echo "Creating "

# --gray is added because the images are grayscale.
# The last argument is the folder that receives the training lmdb.
# --shuffle (line 43 of the original listing, lost in the source) is restored
# from the stock script and the val command below.
# NOTE: the second positional argument (the label list file) is cut off to "$"
# in the source; the stock create_imagenet.sh passes $DATA/train.txt here.
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    --gray \
    $TRAIN_DATA_ROOT \
    $ \
    $EXAMPLE/train_lmdb

echo "Creating "

# --gray is added because the images are grayscale.
# The last argument is the folder that receives the test lmdb.
# NOTE: the label list file argument is cut off to "$" in the source as well;
# the stock create_imagenet.sh passes $DATA/val.txt here.
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    --gray \
    $VAL_DATA_ROOT \
    $ \
    $EXAMPLE/val_lmdb

echo "Done."

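The comments above only mark what was changed (the original annotations were written in Chinese); the actual sh file should not contain Chinese characters. Running the sh file produces two folders, train_lmdb and val_lmdb, inside Build_lmdb, each containing two LMDB files.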
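A quick way to confirm that the script worked is to check that both output folders contain LMDB's two files, data.mdb and lock.mdb. The sketch below does that with the standard library only. lmdb_looks_complete and check_build_dir are hypothetical helper names, and a non-empty data.mdb is only a rough indicator, not proof that every image was converted.

```python
import os


def lmdb_looks_complete(lmdb_dir):
    """True if lmdb_dir holds a non-empty data.mdb plus a lock.mdb."""
    data = os.path.join(lmdb_dir, "data.mdb")
    lock = os.path.join(lmdb_dir, "lock.mdb")
    return (os.path.isfile(data) and os.path.getsize(data) > 0
            and os.path.isfile(lock))


def check_build_dir(build_dir):
    """Map each expected LMDB folder name to whether it looks complete."""
    return {name: lmdb_looks_complete(os.path.join(build_dir, name))
            for name in ("train_lmdb", "val_lmdb")}
```

Calling check_build_dir on the Build_lmdb folder should report True for both entries after a successful run.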

(4) Editing lenet_solver.prototxt and lenet_train_test.prototxt

Copy the three files train_lenet.sh, lenet_solver.prototxt and lenet_train_test.prototxt from caffe/examples/mnist into My_File. First edit train_lenet.sh as follows; only the path to the solver prototxt changes:

#!/usr/bin/env sh
set -e

# Changed: path to the solver file
./build/tools/caffe train --solver=My_File/lenet_solver.prototxt $@
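The paths in train_lenet.sh (./build/tools/caffe and My_File/...) are relative to the Caffe root, so the script has to be started from that directory. As an illustration, the same command can be assembled in Python. caffe_train_command is a made-up helper name, and the sketch only builds the argument list without running anything.

```python
def caffe_train_command(solver="My_File/lenet_solver.prototxt", extra_args=()):
    """Return the argument list that train_lenet.sh runs.

    Both the executable and the solver path are relative to the Caffe root.
    """
    return ["./build/tools/caffe", "train", "--solver=" + solver] + list(extra_args)


cmd = caffe_train_command()
print(" ".join(cmd))
```

Passing the list to subprocess.run(cmd, cwd=caffe_root, check=True), with caffe_root pointing at your Caffe checkout, would be one way to launch training from any directory.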

Then edit lenet_solver.prototxt as follows:

# The train/test net protocol buffer definition
net: "My_File/lenet_train_test.prototxt"  # changed
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
# (the max_iter line is missing in the source; the stock lenet_solver.prototxt
# uses max_iter: 10000)
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "My_File/"  # changed
# solver mode: CPU or GPU
solver_mode: GPU  # or CPU
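Two numbers in this solver are worth checking against your own data. With lr_policy: "inv", Caffe computes the learning rate as base_lr * (1 + gamma * iter)^(-power), and test_iter should be chosen so that test_iter times the test batch size covers the validation set. The sketch below works through both. The function names and the validation-set size of 2350 are invented for illustration.

```python
import math


def inv_lr(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under the "inv" policy: base_lr * (1 + gamma * iter) ** (-power)."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


def required_test_iter(num_val_images, test_batch_size=100):
    """Forward passes needed so the test phase sees every validation image once."""
    return math.ceil(num_val_images / test_batch_size)


# The rate decays smoothly from base_lr as training proceeds
for it in (0, 5000, 10000):
    print(it, round(inv_lr(it), 6))

print(required_test_iter(10000))  # 100, matching the MNIST comment in the solver
print(required_test_iter(2350))   # 24 (2350 is an invented validation-set size)
```

If your validation set is much smaller than MNIST's 10,000 images, leaving test_iter at 100 simply makes the test phase loop over the same images several times.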

Finally, edit lenet_train_test.prototxt as follows:

name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "My_File/Build_lmdb/train_lmdb"  # change to your own path
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "My_File/Build_lmdb/val_lmdb"  # change to your own path
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    # ... (listing truncated in the source)
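The scale value in both data layers is 1/256, so 8-bit gray values from 0 to 255 are mapped into the range [0, 1) before they reach the first convolution. A small worked example (scale_pixel is an illustrative helper, not a Caffe function):

```python
SCALE = 0.00390625  # transform_param.scale from the listing


def scale_pixel(value, scale=SCALE):
    """Apply the data layer's scale factor to one 8-bit pixel value."""
    return value * scale


print(SCALE == 1 / 256)   # True
print(scale_pixel(0))     # 0.0
print(scale_pixel(255))   # 0.99609375
```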

Published: 2024-09-24 18:16:49

Article link: https://www.17tex.com/xueshu/453330.html
