
1 Introduction

With the development of the times, rapid technological innovation has prompted humans to seek liberation from the constraints of their environment [2] and to enhance their ability to adapt to it with the help of external devices. As a result, various human-computer interaction technologies have emerged. Human-computer interaction technology occupies an important position in human society because it serves as the channel of information exchange between humans and machines.

In human-machine systems, our goals include safety, efficiency, and comfort [1]; therefore, how to improve interaction efficiency and reduce operational errors in human-machine interaction has become an important research topic. Ergonomics is the science that studies the interaction between human, machine, and environment and their reasonable combination. Its goal is to make the designed machine-environment system fit people's physiological and psychological characteristics, and thereby improve efficiency, safety, health, and comfort in production. In this paper, our research object is the cockpit of a car. The final purpose is to perform a usability evaluation of the car cockpit; this evaluation can help companies produce more human-friendly car cockpits.

In order to realize the usability evaluation, we need to explore how the driver's driving performance, behavioral responses, and physiological state change under different traffic conditions. Driving performance is evaluated with the binocular camera system. The driver's behavioral responses and physiological state are obtained with the heart rate and respiration rate measuring equipment and the head-mounted eye tracking system.

The focus and innovation of our work lie in image stereo matching, whose result is used to obtain accurate object distance estimates in different task scenarios. Current image stereo matching algorithms fall into two categories [3]: block matching algorithms based on grayscale [8] and feature point matching algorithms [5]. Of the two, feature point matching has the advantages of low computational cost, good robustness, and insensitivity to image deformation [4], so it is the method we use in this paper. A feature point matching algorithm mainly consists of three steps: feature extraction, feature description, and feature matching [9]. It first extracts features from the image, then generates feature descriptors from them, and finally matches the features of the two images according to the similarity of the descriptors [7].

2 Method

In order to realize the usability evaluation, we need to explore how the driver's driving performance, behavioral responses, and physiological state change under different traffic conditions. After completing the experiment, each group yields the following experimental data: a set of images taken by the binocular camera during the experiment, the driver's heart rate, eye movement track characteristics, and the car's speed from GPS. The processing of these data is as follows:

2.1 Image Stereo Calibration and Rectification

In image measurement and machine vision applications, the process of solving the camera parameters (internal parameters, external parameters) is called camera calibration. Following Zhang's method [10], the relationship between the coordinates of a spatial point in the world coordinate system and its coordinates in the pixel coordinate system is:

$$ s\left[ \begin{array}{c} u \\ v \\ 1 \end{array} \right] = K_{3 \times 4} \cdot \left[ \begin{array}{cc} R & T \\ O & 1 \end{array} \right] \cdot \left[ \begin{array}{c} X_{w} \\ Y_{w} \\ Z_{w} \\ 1 \end{array} \right] $$
(1)

Where \( \left( {u,v} \right) \) is the coordinate in the pixel coordinate system, \( \left( {X_{w} ,Y_{w} ,Z_{w} } \right) \) is the coordinate in the world coordinate system, \( K_{3 \times 4} \) is the intrinsic parameter matrix containing the five internal parameters, \( \left[ \begin{array}{cc} R & T \\ O & 1 \end{array} \right] \) is the extrinsic parameter matrix, and \( s = Z_{c} \) is an unknown scale factor.

A perfectly aligned configuration is rare in a real stereo system, since the two cameras almost never have exactly coplanar, row-aligned image planes. The goal of stereo rectification [16] is to reproject the image planes of the two cameras so that they lie in exactly the same plane, with image rows aligned in a frontal parallel configuration.
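Zhang's method and the rectification step are both standard enough that OpenCV implements them; the sketch below shows how this pipeline could look in practice. It is a minimal illustration, not the paper's own code: the function name and the chessboard corner-point inputs are our assumptions.

```python
# A minimal calibration + rectification sketch using OpenCV (an assumption:
# the paper follows Zhang's method [10] but does not name an implementation).
import cv2

def calibrate_and_rectify(objpoints, imgpoints_l, imgpoints_r, image_size):
    # Per-camera calibration gives the intrinsic matrix K and distortion D.
    _, K1, D1, _, _ = cv2.calibrateCamera(objpoints, imgpoints_l, image_size, None, None)
    _, K2, D2, _, _ = cv2.calibrateCamera(objpoints, imgpoints_r, image_size, None, None)

    # Stereo calibration estimates the rotation R and translation T between
    # the two cameras (the extrinsic parameters of Eq. (1)).
    _, K1, D1, K2, D2, R, T, _, _ = cv2.stereoCalibrate(
        objpoints, imgpoints_l, imgpoints_r, K1, D1, K2, D2, image_size,
        flags=cv2.CALIB_FIX_INTRINSIC)

    # Rectification rotates both image planes into one common plane with
    # aligned rows, the frontal parallel configuration described above.
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, image_size, R, T)
    maps_l = cv2.initUndistortRectifyMap(K1, D1, R1, P1, image_size, cv2.CV_32FC1)
    maps_r = cv2.initUndistortRectifyMap(K2, D2, R2, P2, image_size, cv2.CV_32FC1)
    return maps_l, maps_r  # apply with cv2.remap to rectify each frame
```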

2.2 Disparity Calculation

Objects show obvious color changes at their boundaries [13], so we use the gradient to select feature points. First, the gradient in the horizontal direction is calculated with the following formula:

$$ grad\left( {x,y} \right) = Gray\left( {x + 1,y} \right) - Gray\left( {x - 1,y} \right) $$
(2)

After calculating the gradient values of all pixels in the image, a threshold τ is set, and all points whose gradient magnitude exceeds the threshold are marked as feature points.
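A minimal sketch of this selection step, assuming an 8-bit grayscale input image; the function name and the use of the gradient magnitude against τ are our reading of the text.

```python
import numpy as np

def extract_feature_points(gray, tau):
    """Mark pixels whose horizontal gradient (Eq. 2) exceeds the threshold tau."""
    gray = gray.astype(np.int32)                 # avoid uint8 wrap-around
    grad = np.zeros_like(gray)
    grad[:, 1:-1] = gray[:, 2:] - gray[:, :-2]   # Gray(x+1,y) - Gray(x-1,y)
    ys, xs = np.nonzero(np.abs(grad) > tau)
    return list(zip(xs, ys)), grad
```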

After obtaining the feature points of the left and right images, the next step is to match them. From the positional relationship between the two cameras, the position of any spatial point in the left camera image is always to the right of its position in the right camera image [14]. The cost function that measures the similarity of two candidate feature points has two parts: the sum of the absolute values of the corresponding gradient differences over the \( 1 \times 5 \) pixel window centered on the feature points, and the sum of the absolute values of the corresponding gray-level differences over the same window. Formulated as:

$$ C\left( {i,l} \right) = \sum\limits_{j = - 2}^{2} \min \left( {\left| {grad\left( {i + j} \right) - grad\left( {i_{l} + j} \right)} \right|,\tau_{1} } \right) + \sum\limits_{j = - 2}^{2} \min \left( {\left| {Gray\left( {i + j} \right) - Gray\left( {i_{l} + j} \right)} \right|,\tau_{2} } \right) $$
(3)

Where \( i \) refers to the position of the feature point in the left camera image, \( i_{l} \) refers to the candidate position in the right camera image at relative disparity \( l \), and \( \tau_{1} \) and \( \tau_{2} \) are truncation thresholds. The candidate that minimizes the cost function is taken as the matching point. After feature point matching, the disparity \( d \) of each matched feature point pair is obtained.
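A sketch of the truncated cost of Eq. (3) and the disparity search, assuming the match lies \( l \) pixels to the left in the right image; the function names, the `d_max` search range, and the omitted image-border checks are our simplifications.

```python
import numpy as np

def matching_cost(grad_l, gray_l, grad_r, gray_r, x, y, l, tau1, tau2):
    """Truncated gradient + gray-level cost over a 1x5 window (Eq. 3)."""
    cost = 0
    for j in range(-2, 3):  # the 1x5 window centered on the feature point
        cost += min(abs(int(grad_l[y, x + j]) - int(grad_r[y, x - l + j])), tau1)
        cost += min(abs(int(gray_l[y, x + j]) - int(gray_r[y, x - l + j])), tau2)
    return cost

def best_disparity(grad_l, gray_l, grad_r, gray_r, x, y, d_max, tau1, tau2):
    # The match lies to the LEFT in the right image, hence the -l shift above;
    # boundary handling is omitted for brevity.
    costs = [matching_cost(grad_l, gray_l, grad_r, gray_r, x, y, l, tau1, tau2)
             for l in range(d_max + 1)]
    return int(np.argmin(costs)), costs
```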

2.3 Point Cloud Generating and Refinement

According to the principle of pinhole imaging:

$$ \frac{f}{Z} = \frac{d}{T} = \frac{x}{X} = \frac{y}{Y} $$
(4)

Where \( f \) is the focal length and \( T \) is the baseline (the distance between the two optical centers), both obtained from camera calibration; \( d \) is the disparity; \( \left( {x,y} \right) \) is the feature point's coordinate in the left camera image coordinate system; and \( \left( {X,Y,Z} \right) \) is the feature point's three-dimensional coordinate in the camera coordinate system. The actual coordinates therefore follow as:

$$ \left\{ {\begin{array}{*{20}c} {X = \frac{T \cdot x}{d}} \\ {Y = \frac{T \cdot y}{d}} \\ {Z = \frac{T \cdot f}{d}} \\ \end{array} } \right. $$
(5)
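Eq. (5) translates directly into code; the sketch below assumes \( (x,y) \) are measured relative to the principal point and expressed in the same units as the focal length \( f \).

```python
def reproject(x, y, d, f, T):
    """Recover camera-frame coordinates (X, Y, Z) from disparity d via Eq. (5)."""
    X = T * x / d
    Y = T * y / d
    Z = T * f / d   # depth is inversely proportional to disparity
    return X, Y, Z
```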

A sparse point cloud of the feature points can then be generated (see Fig. 1).

Fig. 1. Sparse point cloud of feature points (left: 3D view; right: top view)

The point cloud reveals two problems: it shows a distinct layered structure, and the matching process produces many mismatches.

The layering occurs because the disparity obtained by the above matching algorithm has integer-pixel accuracy. To obtain higher sub-pixel accuracy, the disparity values require further sub-pixel refinement.
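The paper does not state which refinement it uses; a common choice, shown here purely as an assumption, is parabolic interpolation of the matching cost around the integer minimum (reusing the cost list from the search sketch above).

```python
def subpixel_disparity(costs, d):
    """Fit a parabola through the costs at d-1, d, d+1 and return its vertex."""
    c0, c1, c2 = costs[d - 1], costs[d], costs[d + 1]
    denom = c0 - 2 * c1 + c2
    if denom == 0:          # flat cost curve: keep the integer disparity
        return float(d)
    return d + 0.5 * (c0 - c2) / denom
```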

Mismatches seriously affect target recognition [15], so we remove them in three steps:

  1. Filtering: the point cloud of feature points on a real target is usually dense, while mismatched feature points are scattered in empty regions of space. Under this condition, we can remove sparse feature point sets based on the local density of the point cloud, as sketched after this list.

  2. Feature point enhancement: when a continuous feature point set is too dense, it affects the subsequent matching results to a certain degree. We therefore merge an overly dense feature point set into a single feature point.

  3. Ordering check: feature point matching searches for the match of each left-image feature point along the epipolar direction, and the feature points in the left image are ordered by their coordinates. The coordinates of the matching points in the right image should therefore preserve this order, so we can remove matching points that violate the ordering constraint.
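A brute-force sketch of the density filter from step 1; `radius` and `min_neighbors` are hypothetical parameters, and a KD-tree would replace the O(n²) loop on large clouds.

```python
import numpy as np

def density_filter(points, radius, min_neighbors):
    """Keep points with at least min_neighbors other points within radius."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        dists = np.linalg.norm(pts - p, axis=1)
        if np.count_nonzero(dists < radius) - 1 >= min_neighbors:  # -1: self
            keep.append(i)
    return pts[keep]
```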

2.4 Clustering

After removing the mismatched points, we obtain a relatively accurate sparse point cloud. The next step is to segment each target in the point cloud. As can be seen above, the points belonging to a target are usually concentrated, so a clustering algorithm can be used for target detection and segmentation.

Commonly used clustering algorithms include the K-means algorithm [18] and the DBSCAN algorithm [17]. K-means requires knowing the number of classes in the data in advance, which obviously does not apply to our scenario. DBSCAN is a density-based clustering algorithm that fits the characteristics of our point cloud data well and does not require the number of classes in advance. We therefore use DBSCAN. Figure 2 shows the clustering result.

Fig. 2. The clustering result (left: 3D view; right: top view)
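A minimal sketch of this segmentation step with scikit-learn's DBSCAN; the `eps` and `min_samples` values are placeholders, since the paper does not report its parameters.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def segment_targets(points, eps=0.5, min_samples=8):
    """Group the filtered point cloud into targets; label -1 marks noise."""
    points = np.asarray(points)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    return {k: points[labels == k] for k in set(labels) if k != -1}
```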

2.5 Timestamp Synchronization

In the experiment, we use distributed acquisition with timestamp synchronization, which allows flexible configuration and avoids a system crash caused by a single device failure. After clock synchronization, the timestamps of the individual data files must be matched and the files combined into one. Based on a reasonable preset frequency \( f_{p} \), data sampled faster than \( f_{p} \) are downsampled, and data sampled slower than \( f_{p} \) are interpolated.
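A sketch of this merge step with pandas, assuming each device produces a DataFrame with a datetime column `t`; the column names and the nearest-sample alignment via `merge_asof` are our assumptions, not the paper's stated implementation.

```python
import pandas as pd

def merge_streams(heart_df, gps_df, f_p):
    """Resample two timestamped streams onto a common grid at frequency f_p (Hz)."""
    period_ms = int(1000 / f_p)
    start = max(heart_df["t"].min(), gps_df["t"].min())
    end = min(heart_df["t"].max(), gps_df["t"].max())
    grid = pd.date_range(start, end, freq=f"{period_ms}ms").to_frame(index=False, name="t")
    # merge_asof aligns each grid timestamp with the nearest device sample;
    # adding a tolerance would leave NaNs in sparse streams for interpolate().
    merged = pd.merge_asof(grid, heart_df.sort_values("t"), on="t", direction="nearest")
    merged = pd.merge_asof(merged, gps_df.sort_values("t"), on="t", direction="nearest")
    return merged.interpolate()
```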

3 Experiment

3.1 Experiment Design

In our comparative experiments, drivers complete tasks in different experimental scenarios designed to induce different behaviors and states. We divide the experimental tasks into primary tasks and secondary tasks. There are three sets of experiments. The primary task of experiment one is maintaining a constant distance from the car in front (Distance Control); the primary task of experiment two is maintaining a constant speed on a straight road (Speed Control); and the primary task of experiment three is maintaining a constant speed on a winding road (Speed and Direction Control). All three experiments share the same secondary tasks: none, manual control of the air-conditioning temperature, and voice control of the air-conditioning temperature.

3.2 Participants and Apparatus

In this experiment, the main equipment includes one binocular camera for image collection, several portable computers, one heart rate measuring device, one head-mounted eye tracker, and a car-mounted GPS (see Fig. 3).

Fig. 3. Apparatus needed in this experiment

We invited three drivers to participate in this experiment. All of them study or work at Shanghai Jiao Tong University, have good vision, and hold driver's licenses.

3.3 Procedure and Data Collection

The whole experiment procedure is as follows:

  1. Equip all equipment: install the binocular camera, GPS module, and driving recorder; fit the driver with the heart rate watch and eye tracker.

  2. Synchronize time: synchronize the system time of all computers to a designated time server via the local area network.

  3. Start the data acquisition programs: after the driver has been informed of the experimental process and is ready, start the acquisition programs of all devices.

  4. Start experiment one: the primary task is maintaining distance from the car in front. Meanwhile, perform the following secondary tasks sequentially:

    a. None.

    b. Manually adjust the temperature of the air conditioner several times.

    c. Adjust the temperature of the air conditioner several times using voice.

  5. Start experiment two: the primary task is maintaining a speed of 20 km/h on a straight road. Meanwhile, perform the same secondary tasks (a)-(c) sequentially.

  6. Start experiment three: the primary task is maintaining a speed of 20 km/h on a winding road. Meanwhile, perform the same secondary tasks (a)-(c) sequentially.

  7. Terminate the data acquisition programs: stop all acquisition programs and place all data in a folder named after the driver's name and the date.

  8. Change driver: switch to the next driver and repeat the above steps.

4 Result

After completing the data processing according to the above method, each group of experiments yields the driving performance via image processing and the driving recorder. Plotting the heart rate and respiration rate over time helps us analyze the driver's psychological state. The eye tracker records the track characteristics of the driver's eye movements while processing visual information. Processing these data gives the following results:

Figure 4 shows a comparison of driving performance under different primary task conditions.

Fig. 4. Car's speed over time in Experiment Two (a) and Experiment Three (a)

Figure 5 shows a comparison of the drivers' psychological state under different primary task conditions.

Fig. 5. Driver's heart rate over time in Experiment Two (a) and Experiment Three (a)

Figure 6 and Fig. 7 show a comparison of driving performance under different secondary task conditions.

Fig. 6. Car's speed over time in the speed control task

Fig. 7. Car's speed over time in the speed and direction control task

Figure 8 and Fig. 9 show how driving performance changes in experiment one under the secondary task conditions.

Fig. 8. Distance from the target over time in the distance and manual control task

Fig. 9. Distance from the target over time in the distance and voice control task

Figure 10 shows how the car's speed and the driver's gaze area change over time in experiment two with the manual control task.

Fig. 10. Car's speed and gaze area over time in the speed and manual control task

5 Discussion

As can be seen from Fig. 4, the car's speed in Experiment 2 is closer to 20 km/h and less dispersed. The average speed in Experiment 2 is 19.4 km/h with a standard deviation of 0.878 km/h, while in Experiment 3 it is 17.8 km/h with a standard deviation of 1.527 km/h. These results confirm that driving performance in Experiment 2 is better; in other words, driving performance improves at lower task difficulty, which conforms to common sense.

As can be seen from Fig. 5, the driver's heart rate in Experiment 2 is lower and smoother. The average heart rate in Experiment 2 is 83.9 bpm with a standard deviation of 4.68, while in Experiment 3 it is 86.8 bpm with a standard deviation of 9.02. These results confirm that the heart rate in Experiment 2 is indeed lower and smoother; in other words, the driver is more relaxed at lower task difficulty, which again conforms to common sense.

In Fig. 6 and Fig. 7, the secondary task conditions of both Experiment 2 and Experiment 3 are none, manual control of the air-conditioning temperature, and voice control of the air-conditioning temperature. The driver obviously needs to allocate extra attention to complete the secondary tasks in conditions (b) and (c), so these conditions can be regarded as interrupted by secondary tasks. As can be seen from Fig. 6 and Fig. 7, the car's speed in condition (a), with no secondary task, is closer to 20 km/h and less dispersed. The calculation results are shown in Table 1.

Table 1. Average speed and standard deviation in Experiments 2 and 3

The calculation results show that driving performance in condition (a) is better than in conditions (b) and (c), and the difference is more pronounced in Experiment 2 than in Experiment 3. We conclude that driving performance is best with no secondary task, although this difference becomes less obvious when the primary task is more complicated. This result also conforms to common sense.

In Fig. 8 and Fig. 9, the secondary tasks in Experiment 1 (b) and (c) are manual control and voice control of the air-conditioning temperature, and the car's distance from the target car represents driving performance. In Fig. 8, the secondary task is manual control: during 0-7 s, while the driver performs the manual control, the distance increases sharply from 20 m to 32.6 m; during 7-13 s, with no secondary task, the distance decreases gradually; during 13-20 s, with another manual control task, the distance keeps decreasing at the previous period's rate. Evidently, when performing manual control, most of the driver's attention is devoted to completing the secondary task, so driving performance becomes extremely poor. In Fig. 9, the secondary task is voice control: during the task the distance is adjusted continuously, because with voice control the driver does not need to look at the touchpad and can devote more attention to the primary task than with manual control. This comparison shows that voice control is safer and more effective than manual control.

In Fig. 10, the secondary task is manual control of the air-conditioning temperature, performed during 6-12 s and 22-28 s. In both periods, when the task begins, the driver's gaze shifts to the center touch screen to complete the manual control; meanwhile, the car's speed drops rapidly because most of the driver's attention is devoted to the secondary task. After finishing the manual control, the driver corrects the car's speed again.

Another experimental result is that the manual control tasks took an average of 14.91 s, while the voice control tasks took an average of 7.83 s. Voice control clearly takes less time.

6 Conclusion

From all the experimental results, we can draw the following conclusions:

  1. The higher the difficulty of the primary task, the lower the driving performance, and the more intense the driver's emotional response.

  2. Interference from secondary tasks also lowers driving performance.

  3. Manual control tasks are more disruptive to driving performance than voice-control tasks.

  4. Manual control tasks take more time than voice-control tasks.

  5. While performing secondary tasks, the driver pays less attention to the primary task, so driving performance drops.