Do You See What You Mean? Using Predictive Visualizations to Reduce Optimism in Duration Estimates

Original link: http://vis.pku.edu.cn/blog/%E4%BD%A0%E6%98%8E%E7%99%BD%E4%BD%A0%E7%9A%84%E6% 84%8F%E6%80%9D%E5%90%97%EF%BC%9F%E4%BD%BF%E7%94%A8%E9%A2%84%E6%B5%8B%E5%8F% AF%E8%A7%86%E5%8C%96%E9%99%8D%E4%BD%8E%E5%AF%B9%E6%97%B6%E9%97%B4%E4%BC%B0/

In everyday life, people tend to give optimistic estimates, shorter than what is realistic, when judging how long a task will take, even when they have relevant experience. This phenomenon is known in psychology as the planning fallacy. This paper [1] proposes a hybrid approach that combines existing debiasing methods with a suitable visualization to help people mitigate the planning fallacy and give more accurate duration estimates.

Figure 1 Quantile dot plot: reflects the probability distribution of task duration

First, the authors model the user’s belief about the task duration. They draw on two existing methods for removing the bias caused by the planning fallacy: task decomposition and surprise lists. Task decomposition splits the whole task into several independent subtasks and derives the estimate for the whole task from the time estimates of the individual subtasks. Existing research shows that the sum of the subtask estimates is often larger than a direct estimate for the entire task, which can offset the underestimation caused by the planning fallacy. A surprise list enumerates the unexpected events that may affect the duration of the task, together with the likelihood of each event. By listing unexpected events, users can give a more objective time estimate.

Then, based on the results of the two debiasing methods, the probability distribution of the user’s belief about the task duration is computed (as shown in Figure 2). First, the user gives lower and upper bounds for the duration of each subtask and each unexpected event. Each duration is then assumed to follow a log-normal distribution, with the user’s lower and upper bounds taken as its 95% confidence interval. This yields a duration distribution for every subtask and every unexpected event. Next, since each of the n unexpected events either occurs or does not, there are 2^n cases, and for each case the probability distribution of the summed duration is obtained. Finally, these distributions are combined in a sum weighted by the probability of each case, which gives the probability distribution of the duration of the entire task.
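The two computational steps in this paragraph can be sketched in Python. This is a minimal illustration rather than the authors’ implementation: the function names `lognormal_params` and `scenario_weights` are hypothetical, the user’s bounds are assumed to be the 2.5% and 97.5% quantiles of the log-normal, and the unexpected events are assumed to be independent of each other.

```python
import math
from itertools import product

Z_95 = 1.959964  # standard normal quantile bounding a central 95% interval


def lognormal_params(lower, upper):
    """Return (mu, sigma) of a log-normal whose 2.5% and 97.5% quantiles
    equal the given lower and upper bounds (an assumption of this sketch)."""
    if not 0 < lower < upper:
        raise ValueError("bounds must satisfy 0 < lower < upper")
    mu = (math.log(lower) + math.log(upper)) / 2
    sigma = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return mu, sigma


def scenario_weights(event_probabilities):
    """Enumerate all 2^n occur / not-occur cases with their probabilities,
    assuming the unexpected events are independent."""
    cases = []
    for occurs in product([False, True], repeat=len(event_probabilities)):
        weight = 1.0
        for happened, p in zip(occurs, event_probabilities):
            weight *= p if happened else 1.0 - p
        cases.append((occurs, weight))
    return cases


# Hypothetical inputs: a subtask estimated at 5 to 20 minutes,
# and two unexpected events with probabilities 0.3 and 0.1.
mu, sigma = lognormal_params(5, 20)
cases = scenario_weights([0.3, 0.1])
```

The weights of all cases sum to 1, so they can serve directly as the mixture weights in the final weighted sum.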

This distribution becomes very complex as the number of unexpected events grows, and a complex distribution confuses users rather than helping them make decisions. Therefore, the Monte Carlo method is used to sample from the final distribution.
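One way such sampling could look is sketched below. The function `sample_total_durations`, the input format, and the example numbers are all assumptions made for illustration; each draw sums one sample per subtask and adds the duration of every unexpected event that happens to occur in that draw.

```python
import math
import random

Z_95 = 1.959964  # standard normal quantile bounding a central 95% interval


def lognormal_params(lower, upper):
    """Log-normal (mu, sigma) with the bounds as 2.5% and 97.5% quantiles."""
    mu = (math.log(lower) + math.log(upper)) / 2
    sigma = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return mu, sigma


def sample_total_durations(subtasks, surprises, n_samples=10_000, seed=0):
    """Draw total task durations by Monte Carlo sampling.

    subtasks:  list of (lower, upper) bounds
    surprises: list of (probability, lower, upper), assumed independent
    """
    rng = random.Random(seed)
    sub_params = [lognormal_params(lo, hi) for lo, hi in subtasks]
    sur_params = [(p, *lognormal_params(lo, hi)) for p, lo, hi in surprises]
    totals = []
    for _ in range(n_samples):
        total = sum(rng.lognormvariate(mu, sigma) for mu, sigma in sub_params)
        for p, mu, sigma in sur_params:
            if rng.random() < p:  # the unexpected event occurs in this draw
                total += rng.lognormvariate(mu, sigma)
        totals.append(total)
    return totals


# Hypothetical example: two subtasks and one unexpected event (minutes).
samples = sample_total_durations([(5, 10), (10, 20)], [(0.3, 5, 15)])
```

Sampling each event’s occurrence inside the loop avoids enumerating the 2^n cases explicitly, which is what keeps the approach manageable as the surprise list grows.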

Figure 2 The process of calculating the probability distribution of users’ beliefs about task time

The next step is to choose a suitable visualization for the probability distribution of the task duration. The authors use a quantile dot plot (Figure 1). To ensure that the sampling result is close to the original distribution, the number of samples is usually very large. To reduce the number of dots in the plot, the samples are binned so that each dot represents one bin of samples. Related work suggests that a 50-dot plot helps people make the best decisions, while a 20-dot plot minimizes poor performance, so the user study examines both 20-dot and 50-dot plots. Beyond the duration itself, people in daily life often care about when to start a task so that it is finished before a deadline. To explore this aspect, the authors provide a text feedback view (Figure 3) and add a slider or a line chart below the text for interactively adjusting the time. A follow-up study compares these two ways of adjusting the time.
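The reduction from many samples to a few dots can be sketched as follows. This assumes a common construction for quantile dot plots, in which the samples are split into equal-probability bins and each bin is represented by its midpoint quantile; the paper’s exact binning may differ, and the names `quantile_dots` and `stack_heights` are hypothetical.

```python
import math
from collections import Counter


def quantile_dots(samples, n_dots=20):
    """Pick one representative value per equal-probability bin.

    Each returned dot stands for a 1 / n_dots share of the samples.
    """
    ordered = sorted(samples)
    m = len(ordered)
    dots = []
    for i in range(n_dots):
        p = (i + 0.5) / n_dots  # midpoint quantile of the i-th bin
        dots.append(ordered[min(m - 1, int(p * m))])
    return dots


def stack_heights(dots, bin_width):
    """Count how many dots fall into each column of the dot plot."""
    counts = Counter(math.floor(d / bin_width) * bin_width for d in dots)
    return dict(sorted(counts.items()))


# Hypothetical stand-in for Monte Carlo samples of task duration.
samples = [float(x) for x in range(1, 1001)]
dots = quantile_dots(samples, n_dots=20)
heights = stack_heights(dots, bin_width=100)
```

With 20 dots each dot corresponds to a 5% chance, and with 50 dots to a 2% chance, so readers can estimate probabilities by counting dots.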

Figure 3 Text feedback view

The user experiment has two main parts. The first part is duration estimation: users answer how long it would take them, given their own circumstances, to go to the grocery store and buy three party supplies. The second part is decision making: based on the first part, users answer when they should leave for the grocery store in order to catch the next train. The specific process is as follows:

In the first part, the user gives an initial estimate of the overall duration after reading the question. After applying the two debiasing methods described earlier, the user gives a second estimate of the overall duration. The user then views a 20-dot or a 50-dot plot (depending on the assigned condition) and gives a third estimate. In the second part, the user makes an initial decision after reading the question. The user can then interactively try different departure times, read the average waiting time for the train and the probability of missing the train from the text feedback, and make a final decision. Finally, because different users trade off waiting time against the probability of missing the train differently, users report whether they are usually late or early.
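The two numbers shown in the text feedback could be derived from duration samples roughly as below. This is a sketch under assumptions: the function `departure_feedback` is hypothetical, and the average waiting time is computed only over the cases in which the train is caught, which the summary does not specify.

```python
def departure_feedback(duration_samples, minutes_until_train):
    """Return (probability of missing the train, average wait if caught).

    minutes_until_train is the time between leaving and the train's departure.
    """
    slack = [minutes_until_train - d for d in duration_samples]
    caught = [s for s in slack if s >= 0]  # task finished before the train left
    p_miss = 1 - len(caught) / len(slack)
    avg_wait = sum(caught) / len(caught) if caught else None
    return p_miss, avg_wait


# Hypothetical durations in minutes, leaving 30 minutes before the train.
p_miss, avg_wait = departure_feedback([10, 20, 30, 40], minutes_until_train=30)
```

Leaving earlier lowers the probability of missing the train but raises the average wait, which is the trade-off users explore with the slider or line chart.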

Figure 4 User experiment process

The main results of the user experiment are as follows. For the change in estimated time (as shown in Figure 5), the results are split into two aspects: the point estimate (the midpoint of the estimated lower and upper bounds) and the uncertainty (the width of the estimated interval). The average increase in the estimates is shown by the black marks. Users generally raised their time estimates after seeing the visualization, which indirectly indicates that their initial estimates were generally too optimistic, in line with the definition of the planning fallacy. In addition, the increase after the debiasing methods is smaller than the increase after the visualization. Since an increase in the estimate is taken to counteract the planning fallacy, it can be argued that the visualization corrects the planning fallacy somewhat better than the debiasing methods. Likewise, after removing some extreme values (the corresponding average increases are shown by the yellow marks), it can be inferred that the debiasing methods are more effective than the visualization at increasing the estimated uncertainty, while the opposite holds for point estimates.

Figure 5 Estimated time improvement

In terms of estimation accuracy (as shown in Figure 6), the share of sample points covered by the estimated interval improves significantly after the visualization is used, as expected. Interestingly, there is no noticeable difference between the 20-dot and 50-dot plots. This may be because the 20-dot plot is already sufficient to describe the overall distribution in this example.
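The coverage measure mentioned here can be read as the fraction of sampled durations that fall inside the user’s estimated interval. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def interval_coverage(samples, lower, upper):
    """Fraction of sampled durations inside the interval [lower, upper]."""
    inside = sum(1 for s in samples if lower <= s <= upper)
    return inside / len(samples)


# Hypothetical samples (minutes) and a user interval of 15 to 35 minutes.
coverage = interval_coverage([12, 18, 25, 31, 40], lower=15, upper=35)
```

A wider or better-placed interval covers more of the samples, so higher coverage means the user’s interval agrees more closely with the modeled distribution.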

Figure 6 Estimation accuracy

For the departure time decision, decisions improve clearly after text feedback is used (the probability of missing the train drops significantly, while the average waiting time increases only slightly). More interestingly, the slider and the line chart show no difference. The authors believe that the text feedback already provides enough information, so the overview offered by the line chart has little impact on the results. Moreover, users without a visualization background need more time to work out how to use the line chart.

Figure 7 Decision-making situation

There are also some other interesting findings. Using task decomposition and surprise lists together improves time estimates here, whereas previous studies have shown that neither method alone leads to a significant improvement. Users also tend to give time estimates in multiples of 5, such as 15 or 20 minutes, and rarely values such as 17 minutes. In addition, users’ decision trade-offs have little to do with whether they report being usually late or early.

This paper proposes a predictive visualization method that combines traditional debiasing methods with visualization. The user experiments show that this method can reduce users’ optimism in time estimation and improve how they express uncertainty in their estimates. An important limitation is that the effect depends on how the user divides and assesses subtasks and unexpected events: if those assessments are already biased, the subsequent results will be affected. The best solution would be to use real task times instead of the computed distribution, but this may be difficult in practice. In addition, this work makes many assumptions when modeling user beliefs, which may not reflect users’ actual cognition, and more complex user interfaces may need to be explored in future work.

References:

[1] Morgane Koval and Yvonne Jansen. Do You See What You Mean? Using Predictive Visualizations to Reduce Optimism in Duration Estimates. In Proceedings of CHI Conference on Human Factors in Computing Systems, Article No. 30, pp. 1–19, 2022.
