Data Visualisation to showcase uncertainty for Newswhip.com

About NewsWhip

NewsWhip is a social media analytics product that tracks content by amount and location of user engagement. It also tracks audience interests and changes in interests over time.

NewsWhip's products are named Spike and Analytics. Spike is used to predict content engagement in real time, while Analytics is a database of historical engagement data for customers across social networks.


Problem Statement

Despite the improved accuracy of our predictions, users still face a level of uncertainty when interpreting predictions in the product. A prediction shown as a single-point estimate creates a false sense of certainty in the predicted value, which may cause users to doubt the product's prediction abilities when the outcome differs. Visually communicating the level of uncertainty will set more realistic expectations for users and help them understand the limitations of the model.


Jobs to be done

  • As a PR strategist, I want to see the actual, predicted and public response to an article or post that could impact my brand, so that I can make an informed decision about how to strategise the brand's reaction.

  • As a PR strategist, I want evidence to back my recommendations so that my clients trust them during a crisis.

  • As a PR strategist, I want to stay aware of stories that might be important to my client and have a sense of their future performance.


User Flow

Desktop.png

Study of types of data visualisations for uncertainty in prediction

Fan chart

In time series analysis, a fan chart joins a simple line chart of observed past data with ranges of possible values for future data, together with a line showing a central estimate or most likely value for the future outcomes. As predictions become increasingly uncertain the further into the future one goes, these forecast ranges spread out, creating distinctive wedge or "fan" shapes, hence the term. Alternative forms of the chart can also include uncertainty for past data, such as preliminary data that is subject to revision.
https://en.wikipedia.org/wiki/Fan_chart_(time_series)
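
To make the widening of the forecast ranges concrete, here is a minimal Python sketch of how the bands of a fan chart could be computed. It is not NewsWhip's prediction model: the function name `fan_bands`, the assumption of normally distributed errors, the square-root growth of the spread with the forecast horizon, and the sample numbers are all assumptions made for illustration.

```python
from statistics import NormalDist


def fan_bands(central, base_sigma, levels=(0.5, 0.8, 0.9)):
    """Return {level: [(lower, upper), ...]} with one pair per forecast step.

    Assumption (not from the product): errors are normally distributed and
    their standard deviation grows with the square root of the horizon.
    """
    std_normal = NormalDist()
    bands = {}
    for level in levels:
        # Two-sided interval: put (1 - level) / 2 of the probability in each tail.
        z = std_normal.inv_cdf(0.5 + level / 2)
        bands[level] = [
            (value - z * base_sigma * step ** 0.5,
             value + z * base_sigma * step ** 0.5)
            for step, value in enumerate(central, start=1)
        ]
    return bands


# Hypothetical central estimate of engagement for the next four time steps.
forecast = [1000, 1200, 1350, 1450]
for level, pairs in fan_bands(forecast, base_sigma=100).items():
    print(level, [(round(lo), round(hi)) for lo, hi in pairs])
```

Each band is centred on the central estimate and gets wider at every step, which is what produces the fan shape when the bands are drawn as shaded areas around the line.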

confidence fan.jpeg

 

Confidence Interval Chart

In statistics, a confidence interval (CI) is a type of interval estimate, computed from the statistics of the observed data, that might contain the true value of an unknown population parameter. The interval has an associated confidence level, or coverage that, loosely speaking, quantifies the level of confidence that the deterministic parameter is captured by the interval. More strictly speaking, the confidence level represents the frequency (i.e. the proportion) of possible confidence intervals that contain the true value of the unknown population parameter. In other words, if confidence intervals are constructed using a given confidence level from an infinite number of independent sample statistics, the proportion of those intervals that contain the true value of the parameter will be equal to the confidence level.

Confidence intervals consist of a range of potential values of the unknown population parameter. However, the interval computed from a particular sample does not necessarily include the true value of the parameter. Based on the (usually taken) assumption that observed data are random samples from a true population, the confidence interval obtained from the data is also random.

https://en.wikipedia.org/wiki/Confidence_interval
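
The frequency interpretation above can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the function name `coverage`, the normal population, its parameters, and the sample size are all hypothetical, and the interval uses a normal (z) approximation, which covers slightly less often than the nominal level for small samples.

```python
import random
from statistics import NormalDist, mean, stdev


def coverage(true_mean=50.0, true_sd=10.0, n=30, level=0.9,
             trials=2000, seed=0):
    """Fraction of simulated confidence intervals that contain true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.5 + level / 2)
    hits = 0
    for _ in range(trials):
        # Draw a fresh random sample from the (assumed) true population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        centre = mean(sample)
        half_width = z * stdev(sample) / n ** 0.5
        # The interval is random; the parameter is fixed.
        if centre - half_width <= true_mean <= centre + half_width:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(coverage(level=0.9))
```

Over many repeated samples, the returned fraction should land close to the chosen confidence level, while any single interval either contains the true value or does not.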

confidence-bars.png

Design Explorations

Exploration 1: Dual Fan chart

The prediction model that we wanted to release with the new visualisation had two options for error margins: the 80-20 model and the 90-10 model. The first exploration showed both models together to enable users to make a more informed decision, but early testing taught us that showing both added to users' confusion.

Article+details+option+2+Copy+2.png

Exploration 2: Single Fan chart

The limitations of the dual fan chart made us choose only one of the models for showcasing the error margins in the prediction. This led to a design exploration with a single fan chart. User testing showed that it worked well, but users were not sure of the time interval of the prediction.

Article+details+option+2+Copy+3.png

Exploration 3: Interval chart

An interval chart shows the point of time in the future for which the algorithm has generated the prediction. It consists of a point with a vertical line that indicates the upper and lower bounds of the prediction's error margin. This chart did not fare well with our users, as it showed only one point in time and did not showcase the range of prediction seen in the previous fan charts.

Article+details+-+CI.png

Exploration 4: Fan & interval chart combined

After reviewing our users' reactions to both charts and the feedback on each, I decided to combine the fan chart and the interval chart, as shown below, and user test the result. This chart performed best: users reported that they could clearly understand both the prediction range and the point in time for which the prediction was made.

Article+details+option+2+Copy.png

Solution

The final data visualisation combined a fan chart and an interval chart so that the user can clearly understand both the error margin and the point in time for which the prediction was made. This solution helped NewsWhip's users understand the predictions feature in the product.

sample1.png

Outcome

The new prediction visualisation made NewsWhip's users (PR strategists) more confident about the predictions in the product, which helped them make stronger recommendations to their clients.

The success of the feature and visualisation was measured through qualitative feedback from our customers and an increase in sales using this feature.

Positive feedback on the new prediction range from a PR strategist at a UK-based PR agency: “It makes me more confident communicating it to clients, there’s more acknowledgement that it (the score) can vary”


Learnings

  • Early research into types of data visualisation guided the project

  • Close collaboration with a data researcher and the engineering team helped in finding technical limitations and de-risking them early in the project

  • Quick iterations are key to an agile process

  • Constant user validation was very useful