| Cotton number | D/m | Z1/m | Z2/m | Z3/m | Z4/m | Z5/m | Z6/m | Z7/m | Z8/m | S/m |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | (0.823, 0.857) | 0.855 | 0.852 | 0.849 | 0.853 | 0.849 | 0.853 | 0.847 | 0.847 | 0.00302 |
| 2 | (0.829, 0.866) | 0.845 | 0.843 | 0.847 | 0.845 | 0.853 | 0.857 | 0.844 | 0.849 | 0.00488 |
| 3 | (0.842, 0.879) | 0.870 | 0.871 | 0.867 | 0.871 | 0.864 | 0.868 | 0.871 | 0.865 | 0.00283 |
| 4 | (0.787, 0.821) | 0.818 | 0.820 | 0.814 | 0.819 | 0.799 | 0.797 | 0.814 | 0.811 | 0.00886 |
| 5 | (0.836, 0.873) | 0.844 | 0.850 | 0.852 | 0.844 | 0.844 | 0.847 | 0.851 | 0.853 | 0.00383 |
| 6 | (0.822, 0.868) | 0.837 | 0.842 | 0.849 | 0.836 | 0.846 | 0.839 | 0.825 | 0.844 | 0.00744 |
| 7 | (0.807, 0.839) | 0.832 | 0.833 | 0.829 | 0.833 | 0.830 | 0.835 | 0.830 | 0.832 | 0.00198 |
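
The S/m column is consistent with the sample standard deviation (n − 1 denominator) of the eight Z measurements in each row. This reading is inferred from the numbers and is not stated in the table itself. A minimal Python sketch that checks it against rows 1 and 7 (`sample_std` is a hypothetical helper name, not something from the study):

```python
import statistics

# Z1..Z8 values (in metres) copied from rows 1 and 7 of the table above.
rows = {
    1: [0.855, 0.852, 0.849, 0.853, 0.849, 0.853, 0.847, 0.847],
    7: [0.832, 0.833, 0.829, 0.833, 0.830, 0.835, 0.830, 0.832],
}


def sample_std(values):
    """Sample standard deviation (n - 1 denominator); the assumed meaning of S."""
    return statistics.stdev(values)


for number, z in rows.items():
    # Round to five decimals to compare with the S/m column.
    print(number, round(sample_std(z), 5))
```

Under this assumption the rounded results agree with the tabulated S values for both rows (0.00302 and 0.00198).
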

Fig. 3 Recognition module network structure in this study. Focus is the slicing operation module for the input image; C3 is the concentrated comprehensive convolution block; CBS is the convolution unit; Concat indicates feature concatenation; Upsample is feature upsampling; Detect is the detection module; Dark is DarkNet, and Dark+CA denotes a Dark module followed by a CA module.
Fig. 4 The structure of the CA module. C denotes the number of feature channels; H and W denote the height and width of the feature map; r is the reduction ratio; X Avg Pool denotes global average pooling along the horizontal direction; Y Avg Pool denotes global average pooling along the vertical direction; Conv2d denotes 2D convolution; BN denotes batch normalization; Sigmoid is an activation function.
Fig. 5 Prediction box and ground truth box of the model. $b_{c_{x}}^{gt}$ and $b_{c_{y}}^{gt}$ are the coordinates of the center point $B^{gt}$ of the ground truth box, and $b_{c_{x}}$ and $b_{c_{y}}$ are the coordinates of the center point $B$ of the prediction box; $h$ and $w$ are the height and width of the prediction box, and $h^{gt}$ and $w^{gt}$ are the height and width of the ground truth box; $\sigma$ is the straight-line distance between $B^{gt}$ and $B$; $c_h$ and $c_w$ are the vertical and horizontal distances between the center points of the two boxes; the same symbols $c_h$ and $c_w$ also denote the height and width of the smallest rectangle enclosing the ground truth box and the prediction box; $\alpha$ is the angle between the line joining the two center points and the horizontal direction.
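
The quantities named in the Fig. 5 caption can be computed from two boxes as sketched below. This is only an illustration under stated assumptions: boxes are given as `(cx, cy, w, h)` tuples, and the function names `center_offsets` and `enclosing_size` are hypothetical. It covers the geometry only, not the loss function used in the study.

```python
import math


def center_offsets(pred, gt):
    """Return (sigma, c_w, c_h, alpha) for boxes given as (cx, cy, w, h)."""
    c_w = abs(gt[0] - pred[0])    # horizontal distance between the centers
    c_h = abs(gt[1] - pred[1])    # vertical distance between the centers
    sigma = math.hypot(c_w, c_h)  # straight-line distance between B and B^gt
    alpha = math.atan2(c_h, c_w)  # angle to the horizontal, in radians
    return sigma, c_w, c_h, alpha


def enclosing_size(pred, gt):
    """Width and height of the smallest axis-aligned rectangle covering both boxes."""
    left = min(pred[0] - pred[2] / 2, gt[0] - gt[2] / 2)
    right = max(pred[0] + pred[2] / 2, gt[0] + gt[2] / 2)
    top = min(pred[1] - pred[3] / 2, gt[1] - gt[3] / 2)
    bottom = max(pred[1] + pred[3] / 2, gt[1] + gt[3] / 2)
    return right - left, bottom - top


# Hypothetical example boxes, chosen so that the centers form a 3-4-5 triangle.
pred_box = (0.0, 0.0, 2.0, 2.0)
gt_box = (3.0, 4.0, 2.0, 2.0)
sigma, c_w, c_h, alpha = center_offsets(pred_box, gt_box)
print(sigma, c_w, c_h, math.sin(alpha))
print(enclosing_size(pred_box, gt_box))
```

With these example boxes, $\sigma = 5$ and $\sin\alpha = c_h / \sigma = 0.8$, while the enclosing rectangle measures 5 by 6.
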
Fig. 7 Schematic diagram of the disparity calculation. The figure uses two directional axes, Z and X; $O_{left}$ and $O_{right}$ are the left and right camera optical centers, respectively; P is the positioning point on the object to be measured; $P_{left}$ and $P_{right}$ are the imaging points of the positioning point P on the left and right camera optical sensors, respectively; $x_{left}$ and $x_{right}$ are the distances of $P_{left}$ and $P_{right}$ from the optical axes of the left and right cameras, respectively; f is the camera focal length; b is the distance between the optical centers of the two cameras; x is the coordinate of point P on the X-axis; Baseline is the line connecting the optical centers of the left and right cameras.
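
The symbols in the Fig. 7 caption correspond to the standard similar-triangles relation for a rectified stereo pair, $Z = f\,b / (x_{left} - x_{right})$. The sketch below illustrates that textbook relation and is not necessarily the exact procedure used in the study. It assumes that the origin lies at the left optical center, that $x_{left}$ and $x_{right}$ are signed offsets in the same units as f (for example pixels), and that b is in metres. The function name `triangulate` is hypothetical.

```python
def triangulate(x_left, x_right, f, b):
    """Return (x, z) of point P from a rectified stereo pair.

    Assumptions: origin at the left optical center; x_left and x_right
    are signed offsets from each camera's optical axis, in the same
    units as the focal length f; b is the baseline length in metres.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = f * b / disparity  # depth along the Z-axis, in metres
    x = z * x_left / f     # lateral position, from x_left / f = x / z
    return x, z


# Hypothetical numbers: f = 800 px, b = 0.1 m, offsets of 100 px and 20 px.
x, z = triangulate(100.0, 20.0, f=800.0, b=0.1)
print(x, z)
```

With these example values the disparity is 80 px, giving a depth of about 1 m and a lateral offset of about 0.125 m.
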
Fig. 10 Field cotton category recognition results based on YOLOX-Cotton. Cotton_Side denotes cotton viewed from the side, as identified by the YOLOX-Cotton model; Cotton_Top denotes front-facing normal cotton; CottonLow_Top denotes front-facing low-grade cotton. The number following each class label is the confidence score.