1 Introduction

One of the tasks of human engineering is to measure the structure or movement of a person; measuring an object's structural or motion parameters is likewise an important problem in computer vision and photogrammetry. When the measurement system consists of multiple cameras, the three-dimensional stance and position can be obtained by intersecting the cameras' rays at no fewer than three characteristic points on the target's surface [1, 2]. The three-dimensional stance and position can be calculated when at least three characteristic points are known in the body's coordinate system. Three-dimensional structure and motion can also be reconstructed from a single camera's image sequence when the target provides enough non-cooperative characteristic points [3,4,5]. Many methods have been applied to measure the three-dimensional stance of targets such as missiles, including the axis method, the ellipticity method, the length–width ratio method, and the spiral method [6, 7].

This article elaborates on a method of monocular measurement for a walker, which is treated as a translation-only one-dimensional target. The approach measures the 3-D pose and position parameters of the walker from two endpoints, after which the position of any other point on the target can be calculated. If the distance between the two endpoints is unknown, the result is determined only up to a coefficient of proportionality. When the distance between two points, the walker's vertex and heel, is known, the movement parameters of the walker can be measured and the true result obtained.

2 Measurement Method

A central perspective projection imaging model is adopted. The camera frame is taken as the reference coordinate system, so the camera lens center is the origin and the camera's rotation matrix R is the identity matrix. \( {\mathbf{M}} = \left[ {\begin{array}{*{20}c} X & Y & Z \\ \end{array} } \right]^{\text{T}} \), whose augmented vector is \( \widetilde{{\mathbf{M}}} = \left[ {\begin{array}{*{20}c} X & Y & Z & 1 \\ \end{array} } \right]^{\text{T}} \), is the target's coordinate in the reference system, and \( {\mathbf{m}} = \left[ {\begin{array}{*{20}c} u & v \\ \end{array} } \right]^{\text{T}} \), whose augmented vector is \( {\tilde{\mathbf{m}}} = \left[ {\begin{array}{*{20}c} u & v & 1 \\ \end{array} } \right]^{\text{T}} \), is the target's image coordinate. The imaging relation is:

$$ \begin{aligned} s{\tilde{\mathbf{m}}} & = {\mathbf{A}}\left[ {\begin{array}{*{20}c} {\mathbf{R}} & {\mathbf{T}} \\ \end{array} } \right]\widetilde{{\mathbf{M}}} \\ {\text{with}}\quad {\mathbf{A}} & = \left[ {\begin{array}{*{20}c} \alpha & \gamma & {u_{0} } \\ 0 & \beta & {v_{0} } \\ 0 & 0 & 1 \\ \end{array} } \right],\quad {\mathbf{R}} = \left[ {\begin{array}{*{20}c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{array} } \right],\quad {\mathbf{T}} = \left[ {\begin{array}{*{20}c} 0 \\ 0 \\ 0 \\ \end{array} } \right] \\ \end{aligned} $$
(1)

where \( \left( {u_{0} ,v_{0} } \right) \) is the principal point coordinate of the image, \( \alpha \) and \( \beta \) are the equivalent focal lengths along the horizontal and vertical image axes, and \( \gamma \) is the skew factor of the two image axes.
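As a concrete illustration of Eq. (1), the following minimal sketch (Python with NumPy) builds the intrinsic matrix and projects a 3-D point. The numerical values of \( \alpha \), \( \beta \), \( \gamma \), \( u_{0} \), \( v_{0} \) and the helper name are illustrative assumptions, not calibration results from this work.

```python
import numpy as np

# Illustrative intrinsic parameters (not from this paper's calibration).
alpha, beta, gamma, u0, v0 = 1200.0, 1200.0, 0.0, 640.0, 512.0
A = np.array([[alpha, gamma, u0],
              [0.0,   beta,  v0],
              [0.0,   0.0,   1.0]])
R = np.eye(3)            # camera frame is the reference frame, so R = I
T = np.zeros((3, 1))     # and the camera translation is zero

def project(M):
    """Project a 3-D point M = [X, Y, Z] to pixel coordinates m = [u, v] per Eq. (1)."""
    M_tilde = np.append(M, 1.0)             # augmented coordinates
    s_m = A @ np.hstack([R, T]) @ M_tilde   # s * m_tilde
    return s_m[:2] / s_m[2]                 # divide out the scale s

print(project(np.array([0.5, -0.2, 10.0])))
```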

As shown in Fig. 1, two end points on a one-dimensional target are considered; the target is translation-only. The camera coordinate system S-XYZ is the reference coordinate system, and the image plane is I. The target's characteristic points at time i (i = 0, 1, …, m − 1) are \( A_{i} \) and \( B_{i} \), and the distance from \( A_{i} \) to \( B_{i} \) is L. The image points corresponding to these characteristic points are \( a_{i} \) and \( b_{i} \).

Fig. 1
Imaging relation of a translation-only one-dimensional target

Suppose the positions of the target’s characteristic points A and B are \( {\mathbf{M}}_{A0} = \left[ {\begin{array}{*{20}c} {X_{A0} } & {Y_{A0} } & {Z_{A0} } \\ \end{array} } \right]^{\text{T}} \) and \( {\mathbf{M}}_{B0} = \left[ {\begin{array}{*{20}c} {X_{B0} } & {Y_{B0} } & {Z_{B0} } \\ \end{array} } \right]^{\text{T}} \) at time 0, and that their positions at time i are \( {\mathbf{M}}_{Ai} = \left[ {\begin{array}{*{20}c} {X_{Ai} } & {Y_{Ai} } & {Z_{Ai} } \\ \end{array} } \right]^{\text{T}} \) and \( {\mathbf{M}}_{Bi} = \left[ {\begin{array}{*{20}c} {X_{Bi} } & {Y_{Bi} } & {Z_{Bi} } \\ \end{array} } \right]^{\text{T}} \). The translation vector of the position at time i corresponding to the position at time 0 is \( {\mathbf{T}}_{i} = \left[ {\begin{array}{*{20}c} {T_{Xi} } & {T_{Yi} } & {T_{Zi} } \\ \end{array} } \right]^{\text{T}} \). Thus

$$ \left\{ {\begin{array}{*{20}l} {{\mathbf{M}}_{Ai} = {\mathbf{M}}_{A0} + {\mathbf{T}}_{i} } \hfill \\ {{\mathbf{M}}_{Bi} = {\mathbf{M}}_{B0} + {\mathbf{T}}_{i} } \hfill \\ \end{array} } \right.\quad \left( {i = 1,2, \ldots, m - 1} \right) $$
(2)

Suppose the image coordinates of A and B are \( {\mathbf{m}}_{Ai} = \left[ {\begin{array}{*{20}c} {u_{Ai} } & {v_{Ai} } \\ \end{array} } \right]^{\text{T}} \) and \( {\mathbf{m}}_{Bi} = \left[ {\begin{array}{*{20}c} {u_{Bi} } & {v_{Bi} } \\ \end{array} } \right]^{\text{T}} \) at time i.

According to Formula (2), the imaging relation of the target’s characteristic points A and B can then be expressed as:

$$ \left\{ {\begin{array}{*{20}l} {s_{A0} {\tilde{\mathbf{m}}}_{A0} = {\mathbf{A}}\left[ {\begin{array}{*{20}c} {\mathbf{R}} & {\mathbf{T}} \\ \end{array} } \right]\widetilde{{\mathbf{M}}}_{A0} } \hfill \\ {s_{B0} {\tilde{\mathbf{m}}}_{B0} = {\mathbf{A}}\left[ {\begin{array}{*{20}c} {\mathbf{R}} & {\mathbf{T}} \\ \end{array} } \right]\widetilde{{\mathbf{M}}}_{B0} } \hfill \\ {s_{Ai} {\tilde{\mathbf{m}}}_{Ai} = {\mathbf{A}}\left[ {\begin{array}{*{20}c} {\mathbf{R}} & {\mathbf{T}} \\ \end{array} } \right]\left( {\widetilde{{\mathbf{M}}}_{A0} + \widetilde{{\mathbf{T}}}_{i} } \right)\quad \left( {i = 1,2, \ldots ,m - 1} \right)} \hfill \\ {s_{Bi} {\tilde{\mathbf{m}}}_{Bi} = {\mathbf{A}}\left[ {\begin{array}{*{20}c} {\mathbf{R}} & {\mathbf{T}} \\ \end{array} } \right]\left( {\widetilde{{\mathbf{M}}}_{B0} + \widetilde{{\mathbf{T}}}_{i} } \right)} \hfill \\ \end{array} } \right. $$
(3)

\( {\tilde{\mathbf{m}}}_{A0} , {\tilde{\mathbf{m}}}_{B0} , {\tilde{\mathbf{m}}}_{Ai} , {\tilde{\mathbf{m}}}_{Bi} , \widetilde{{\mathbf{M}}}_{A0} , \widetilde{{\mathbf{M}}}_{B0} \) and \( \widetilde{{\mathbf{T}}}_{i} \) are the augmented vectors of \( {\mathbf{m}}_{A0} , {\mathbf{m}}_{B0} , {\mathbf{m}}_{Ai} , {\mathbf{m}}_{Bi} , {\mathbf{M}}_{A0} , {\mathbf{M}}_{B0} \) and \( {\mathbf{T}}_{i} \). \( s_{A0} , s_{B0} , s_{Ai} , s_{Bi} \) are scale factors. \( {\mathbf{A}} \) is the camera's intrinsic parameter matrix, \( {\mathbf{R}} \) is the 3 × 3 identity matrix, and \( {\mathbf{T}} \) is the 3 × 1 zero vector.
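The observation model of Eqs. (2)–(3) can be sketched as follows: both end points undergo the same translation \( {\mathbf{T}}_{i} \), and each is projected with the identity rotation and zero camera translation. The function name and its inputs (intrinsic matrix A, initial points M_A0 and M_B0, and the list of per-frame translations) are illustrative assumptions, not code from this work.

```python
import numpy as np

def observe(A, M_A0, M_B0, translations):
    """Return pixel coordinates of points A and B at times 0, 1, ..., m - 1."""
    obs_A, obs_B = [], []
    for T_i in [np.zeros(3)] + list(translations):    # T_0 = 0 at time 0
        for M0, out in ((M_A0, obs_A), (M_B0, obs_B)):
            M_i = M0 + T_i            # Eq. (2): pure translation of the point
            s_m = A @ M_i             # Eq. (3) with R = I and zero camera T
            out.append(s_m[:2] / s_m[2])
    return np.array(obs_A), np.array(obs_B)
```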

Define the intermediate variables as:

$$ \left\{ {\begin{array}{*{20}c} {\begin{array}{*{20}l} {g_{0} = {{X_{A0} } \mathord{\left/ {\vphantom {{X_{A0} } {Z_{A0} }}} \right. \kern-0pt} {Z_{A0} }},g_{1} = {{Y_{A0} } \mathord{\left/ {\vphantom {{Y_{A0} } {Z_{A0} }}} \right. \kern-0pt} {Z_{A0} }}} \hfill \\ {g_{2} = {{X_{B0} } \mathord{\left/ {\vphantom {{X_{B0} } {Z_{A0} }}} \right. \kern-0pt} {Z_{A0} }},g_{3} = {{Y_{B0} } \mathord{\left/ {\vphantom {{Y_{B0} } {Z_{A0} }}} \right. \kern-0pt} {Z_{A0} }},g_{4} = {{Z_{B0} } \mathord{\left/ {\vphantom {{Z_{B0} } {Z_{A0} }}} \right. \kern-0pt} {Z_{A0} }}} \hfill \\ {g_{5,i} = {{T_{Xi} } \mathord{\left/ {\vphantom {{T_{Xi} } {Z_{A0} }}} \right. \kern-0pt} {Z_{A0} }},g_{6,i} = {{T_{Yi} } \mathord{\left/ {\vphantom {{T_{Yi} } {Z_{A0} }}} \right. \kern-0pt} {Z_{A0} }},g_{7,i} = {{T_{Zi} } \mathord{\left/ {\vphantom {{T_{Zi} } {Z_{A0} }}} \right. \kern-0pt} {Z_{A0} }}} \hfill \\ \end{array} } & {\left( {i = 1,2, \ldots ,m - 1} \right)} \\ \end{array} } \right. $$
(4)
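Substituting Eq. (4) into the projection relations of Eq. (3) yields, for every frame, linear constraints on \( g_{0} , \ldots ,g_{4} \) and \( g_{5,i} ,g_{6,i} ,g_{7,i} \). The sketch below stacks these constraints and solves them by least squares. The function name solve_g, the input layout (normalized image coordinates \( {\mathbf{A}}^{ - 1} {\tilde{\mathbf{m}}} \) of A and B at times 0 … m − 1), and the row ordering are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def solve_g(xA, xB):
    """Least-squares solve for [g0..g4, g5_1, g6_1, g7_1, ..., g7_{m-1}].

    xA, xB: (m, 2) arrays of normalized image coordinates of the two end
    points at times 0 .. m-1 (assumed layout).
    """
    m = len(xA)
    n = 5 + 3 * (m - 1)                        # unknowns of Eq. (4)
    rows, rhs = [], []

    def add(coeffs, b):
        row = np.zeros(n)
        for j, c in coeffs:
            row[j] = c
        rows.append(row)
        rhs.append(b)

    # time 0: point A fixes g0, g1 directly; point B couples g2, g3 to g4
    add([(0, 1.0)], xA[0, 0])                           # g0 = x_A0
    add([(1, 1.0)], xA[0, 1])                           # g1 = y_A0
    add([(2, 1.0), (4, -xB[0, 0])], 0.0)                # g2 - x_B0 * g4 = 0
    add([(3, 1.0), (4, -xB[0, 1])], 0.0)                # g3 - y_B0 * g4 = 0

    for i in range(1, m):                               # times 1 .. m-1
        k = 5 + 3 * (i - 1)                             # index of g5,i
        add([(0, 1.0), (k, 1.0), (k + 2, -xA[i, 0])], xA[i, 0])
        add([(1, 1.0), (k + 1, 1.0), (k + 2, -xA[i, 1])], xA[i, 1])
        add([(2, 1.0), (4, -xB[i, 0]), (k, 1.0), (k + 2, -xB[i, 0])], 0.0)
        add([(3, 1.0), (4, -xB[i, 1]), (k + 1, 1.0), (k + 2, -xB[i, 1])], 0.0)

    g, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return g
```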

In this way, a linear system in \( g_{0} ,g_{1} ,g_{2} ,g_{3} ,g_{4} \) and \( g_{5,i} ,g_{6,i} ,g_{7,i} \) can be set up and solved. Suppose the distance between A and B is L; then

$$ \left( {X_{B0} - X_{A0} } \right)^{2} + \left( {Y_{B0} - Y_{A0} } \right)^{2} + \left( {Z_{B0} - Z_{A0} } \right)^{2} = L^{2} $$
(5)

Therefore,

$$ Z_{A0} = {L \mathord{\left/ {\vphantom {L {\sqrt {\left( {g_{2} - g_{0} } \right)^{2} + \left( {g_{3} - g_{1} } \right)^{2} + \left( {g_{4} - 1} \right)^{2} } }}} \right. \kern-0pt} {\sqrt {\left( {g_{2} - g_{0} } \right)^{2} + \left( {g_{3} - g_{1} } \right)^{2} + \left( {g_{4} - 1} \right)^{2} } }} $$
(6)

Furthermore, we can obtain the remaining parameters:

$$ \left\{ {\begin{array}{*{20}l} {X_{A0} = g_{0} Z_{A0} ,\,Y_{A0} = g_{1} Z_{A0} } \hfill \\ {X_{B0} = g_{2} Z_{A0} ,\,Y_{B0} = g_{3} Z_{A0} ,\,Z_{B0} = g_{4} Z_{A0} } \hfill \\ {T_{Xi} = g_{5,i} Z_{A0} ,\,T_{Yi} = g_{6,i} Z_{A0} ,\,T_{Zi} = g_{7,i} Z_{A0} } \hfill \\ \end{array} } \right.\quad \left( {i = 1,2, \ldots ,m - 1} \right) $$
(7)

When the exact distance between A and B is known, the positions of A and B at any point in time (that is, the initial positions and the translation vector at every time) can be determined exactly.
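Continuing the assumed solve_g output above, a sketch of the scale recovery of Eqs. (6)–(7) is:

```python
import numpy as np

def recover_motion(g, L, m):
    """Apply Eqs. (6)-(7): a known distance L between A and B fixes Z_A0,
    and hence the initial positions and every translation vector."""
    g0, g1, g2, g3, g4 = g[:5]
    Z_A0 = L / np.sqrt((g2 - g0) ** 2 + (g3 - g1) ** 2 + (g4 - 1.0) ** 2)  # Eq. (6)

    M_A0 = Z_A0 * np.array([g0, g1, 1.0])        # Eq. (7), point A at time 0
    M_B0 = Z_A0 * np.array([g2, g3, g4])         # Eq. (7), point B at time 0
    T = [Z_A0 * np.asarray(g[5 + 3 * (i - 1): 5 + 3 * i])   # translations T_i
         for i in range(1, m)]
    return M_A0, M_B0, T
```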

However, if the distance between A and B is not known, the measured result is related to the true result only by a coefficient of proportionality. In practice, if the distance between a target point and the camera is known, the true result can also be recovered. The pose of a one-dimensional target can be solved if the characteristic points' coordinates in the target frame are known [8]; in that case, no scale information is needed.
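As an illustration of the alternative scaling just mentioned: since \( {\mathbf{M}}_{A0} = Z_{A0} \left[ {\begin{array}{*{20}c} {g_{0} } & {g_{1} } & 1 \\ \end{array} } \right]^{\text{T}} \), a known range D from the camera to point A also fixes the scale. The helper below is a hypothetical sketch of this step, not part of the paper's method description.

```python
import numpy as np

def scale_from_range(g, D):
    """Assumed alternative: fix the scale from a known camera-to-A range D,
    using ||M_A0|| = Z_A0 * sqrt(g0^2 + g1^2 + 1)."""
    g0, g1 = g[0], g[1]
    return D / np.sqrt(g0 ** 2 + g1 ** 2 + 1.0)   # this is Z_A0
```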

3 Experimental Validation

The target to be measured is a person walking steadily at a distance of about 100 m in front of the camera. The medial axis of the person is taken as the one-dimensional target. The vertex of the head (calvaria) and the intersection of the medial axis with the line connecting the two touchdown points are chosen as the starting point and the end point. The camera remains stationary on the ground and captures images, some of which are shown in Fig. 2. The small circles are the projections onto the image plane of the calculated positions of the walker's end characteristic point, and the large circles mark the result for the current frame; the results for all times are drawn in every image. The triangles are the calculated positions of the head vertex projected onto the image plane. All projected results are consistent with the target's current pose and position.

Fig. 2
Measurement of the position and stance of a person walking steadily and the projection of calculated results

The person pauses at every measured position, and while the images are taken the true 3-D coordinates are obtained with a geodetic surveying system. In the XZ plane of the reference coordinate system, the true and calculated trajectories are shown in Fig. 3, where the solid line represents the true trajectory and the dashed line represents the calculated trajectory.

Fig. 3
Statistical error of the measured result of the walker when the true result is known

4 Conclusion

This article elaborates on the theory of monocular measurement for a walker, a translation-only one-dimensional target, and validates the approach with practical experiments. In this approach, a single camera takes at least two images and uses at least two characteristic points on the target to measure the walker's movement parameters. Provided the distance or another metric measurement between two characteristic points is known, the three-dimensional position and stance can be calculated. Besides the two characteristic points, the position of any other point on the target can also be calculated. If no metric information is available, there will be a coefficient of proportionality between the measured result and the true result.