Scatter Diagram

The scatter diagram is one of the seven basic tools of quality, and many professionals struggle with it.

A scatter diagram is also known as a scatter plot, scatter graph, or correlation chart.

Other diagrams use lines or bars to display data; a scatter diagram uses dots. This may look confusing at first, but it is often easier to understand than other charts.

An English scientist, John Frederick W. Herschel, presented the scatter diagram in 1833 in his study of the orbits of double stars.

In 1886, the scatter diagram was popularized by an English Victorian-era polymath named Francis Galton. He is also credited with creating the statistical concept of correlation.

In this blog post, I will explain the scatter diagram. 

Scatter Diagram (Scatter Plot)

A scatter diagram plots two variables against each other. Usually, the first variable is independent, and the second variable is dependent on the first.

[Figure: scatter diagram]

The scatter diagram is considered the simplest way to study the correlation between these two variables. After determining how they are related, you can predict the behavior of the dependent variable based on the independent variable. 
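One common way to turn the relationship into a prediction is to fit a straight line through the points. The sketch below is only an illustration of that idea, assuming the relationship is roughly linear; the helper name `fit_line` and the temperature and sales figures are hypothetical and do not come from any real data set.

```python
def fit_line(xs, ys):
    """Least-squares straight line: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired observations (illustrative only):
# independent variable = daily high temperature in deg C,
# dependent variable = cold drinks sold that day.
temperature = [20, 22, 25, 27, 30, 32, 35]
drinks_sold = [110, 118, 135, 140, 160, 165, 182]

slope, intercept = fit_line(temperature, drinks_sold)

# Predict the dependent variable for a temperature that was not observed.
predicted = slope * 28 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted at 28 C: {predicted:.0f}")
```

A prediction like this is only as good as the correlation behind it, and it should not be stretched far beyond the range of the observed data.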

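A scatter plot needs paired numerical data: each dot represents one observation with a measured value for both variables. It is especially useful when you can set or control one variable and want to see how the other responds.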

Definition: According to the PMBOK Guide, a scatter diagram is “a graph that shows the relationship between two variables. Scatter diagrams can show a relationship between elements of a process, environment, or activity on one axis and a quality defect on the other axis.”

Example of Using a Scatter Diagram

You are analyzing accident patterns on a highway. You select two variables, vehicle speed and the number of accidents, and draw the diagram.

Once the drawing is complete, you notice that the number of accidents increases as the speed of vehicles increases. This reveals the correlation between the two.

In most cases, the independent variable is plotted along the horizontal axis (x-axis), and the dependent variable is plotted along the vertical axis (y-axis). The independent variable operates as the control parameter because it influences the behavior of the dependent variable.
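To make the axis convention concrete, here is a minimal text-only sketch that places the independent variable on the horizontal axis and the dependent variable on the vertical axis. The function name `ascii_scatter` and the speed and accident numbers are hypothetical, chosen only to mirror the highway example; a real project would normally use a charting tool instead.

```python
def ascii_scatter(xs, ys, width=40, height=10):
    """Return a text scatter plot: x runs left to right, y runs bottom to top."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column/row index; "or 1" guards a zero range.
        col = round((x - x_min) / ((x_max - x_min) or 1) * (width - 1))
        row = round((y - y_min) / ((y_max - y_min) or 1) * (height - 1))
        # Row 0 of the grid is the top line, so flip the vertical index.
        grid[height - 1 - row][col] = "*"
    lines = ["|" + "".join(r) for r in grid]
    lines.append("+" + "-" * width)
    return "\n".join(lines)


# Hypothetical observations (illustrative only):
# independent variable on x = average vehicle speed in km/h,
# dependent variable on y = accidents recorded per month.
speed_kmh = [60, 70, 80, 90, 100, 110, 120]
accidents = [2, 3, 3, 5, 6, 8, 9]

print(ascii_scatter(speed_kmh, accidents))
```

Each `*` is one paired observation. With data like this, the dots climb from the lower left to the upper right.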

It is not necessary to have a controlling parameter to draw a scatter diagram. There can also be two independent variables. In that case, you can use any axis for any variable.

Many professionals believe that a scatter diagram is like a fishbone diagram because the latter includes two parameters: cause and effect. 

Note that these two diagrams are different. The fishbone diagram lists the possible causes of an effect; however, it does not show how a cause and the effect are related. The scatter plot helps you analyze the correlation between the two variables.

However, the fishbone or Ishikawa diagram can help you draw a scatter diagram. For example, you can use the fishbone diagram to find the two variables (cause and effect) and then use the scatter diagram to analyze their relationship.

Types of Scatter Diagrams

You can classify scatter diagrams in many ways. I will discuss the two most popular: by strength of correlation and by slope of the trend. These are the most common in project management.

Depending on the correlation, you can divide scatter diagrams into the following categories:

  • Scatter Diagram with No Correlation
  • Scatter Diagram with Moderate Correlation
  • Scatter Diagram with Strong Correlation

Scatter Diagram with No Correlation

This diagram is known as the “Scatter Diagram with Zero Degree of Correlation.”

[Figure: scatter diagram with no correlation]

Here, the data points are spread so randomly that you cannot draw a line through them.

Therefore, you can conclude that these variables do not correlate.

Scatter Diagram with Moderate Correlation

This plot is known as a “Scatter Diagram with a Low Degree of Correlation.”

[Figure: scatter diagram with moderate correlation]

Here, the data points are a little closer together, and you can see some relationship between the variables.

Scatter Diagram with Strong Correlation

This diagram is known as a “Scatter Diagram with a High Degree of Correlation.”

In this diagram, the data points are close together, and you can draw a line by following their pattern.

[Figure: scatter diagram with strong correlation]

In this case, you conclude that these variables are closely related.

As discussed earlier, you can categorize the scatter diagram according to the slope, or trend, of the data points:

  • Scatter Diagram with Strong Positive Correlation
  • Scatter Diagram with Weak Positive Correlation
  • Scatter Diagram with Strong Negative Correlation
  • Scatter Diagram with Weak Negative Correlation
  • Scatter Diagram with Weakest (or no) Correlation

A strong positive correlation means a visible upward trend from left to right; a strong negative correlation means a visible downward trend from left to right. A weak correlation means the trend is less clear. A flat, horizontal pattern is the weakest case because it is neither positive nor negative. A scatter diagram with no correlation suggests that the independent variable does not affect the dependent variable.

Scatter Diagram with Strong Positive Correlation

[Figure: scatter diagram with strong positive correlation]

This diagram is known as a “Scatter Diagram with Positive Slant.”

In a positive slant, the correlation is positive, i.e., as the value of X increases, the value of Y also increases. A straight line drawn along the data points slopes upward, and the points lie close to that line.

For example, cold drink sales will increase if the weather gets hotter.

Scatter Diagram with Weak Positive Correlation

[Figure: scatter diagram with weak positive correlation]

As the value of X increases, the value of Y also tends to increase, but the points are scattered more loosely and resemble a straight line less clearly.

Scatter Diagram with Strong Negative Correlation

[Figure: scatter diagram with strong negative correlation]

This diagram is known as a “Scatter Diagram with a Negative Slant.”

In the negative slant, the correlation is negative, i.e., as the value of X increases, the value of Y will decrease. The slope of a straight line drawn along the data points will go down.

For example, if the temperature increases, the sale of winter coats goes down.

Scatter Diagram with Weak Negative Correlation

[Figure: scatter diagram with weak negative correlation]

As the value of X increases, the value of Y will decrease, but the pattern is not clear.

Scatter Diagram with No Correlation

No relationship between the two variables can be seen. The chart might show a cloud of points with no visible trend or a straight, flat row of points. In either case, the second variable does not depend on the first.
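The five categories above can be approximated numerically with the Pearson correlation coefficient, whose sign gives the direction of the trend and whose size gives its strength. The sketch below is only an illustration: the function names `pearson_r` and `describe` are hypothetical, and the cutoffs of 0.7 and 0.3 are assumptions made for this example, not standard values.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired data, between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        # A flat row of points: r is mathematically undefined here.
        # Treating it as "no correlation" is an assumption of this sketch.
        return 0.0
    return sxy / sqrt(sxx * syy)


def describe(r, strong=0.7, weak=0.3):
    """Map r to one of the five categories (the cutoffs are assumed values)."""
    if abs(r) < weak:
        return "no correlation"
    strength = "strong" if abs(r) >= strong else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


# Hypothetical data sets (illustrative only).
x = [1, 2, 3, 4]
print(describe(pearson_r(x, [2, 4, 6, 8])))  # points rise along a straight line
print(describe(pearson_r(x, [8, 6, 4, 2])))  # points fall along a straight line
print(describe(pearson_r(x, [5, 5, 5, 5])))  # a flat row of points
```

Where to draw the line between "strong" and "weak" is a judgment call, so adjust the cutoffs to whatever your organization considers meaningful.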

Limitations of a Scatter Diagram

  • A scatter diagram cannot tell you the exact strength of a correlation.
  • A scatter diagram does not show a quantitative measurement of the relationship between the variables. It only shows a qualitative expression of quantitative change.
  • This chart does not show you the relationship among more than two variables.
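When a number is needed, a common companion to the chart is the Pearson correlation coefficient. It is not part of the scatter diagram itself and is shown here only as a reference formula, assuming n paired observations (x_i, y_i) with means x̄ and ȳ:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The value of r lies between -1 and 1. Values near 1 or -1 indicate points that lie close to a rising or falling straight line, and values near 0 indicate no linear relationship.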

Benefits of a Scatter Diagram

  • It shows the relationship between two variables.
  • It is one of the simplest ways to reveal a non-linear pattern.
  • The range of the data, such as the maximum and minimum values, can be read directly from the chart.
  • Patterns are easy to observe.
  • Plotting the diagram is simple.

When You Should Use a Scatter Diagram

You should use the scatter diagram in the following cases:

  • When you have paired numerical data, you can draw a scatter plot to see how the two variables are related. For example, working hours versus earnings.
  • To figure out whether two variables are related at all. For example, whether there is any relation between a rise in temperature and equipment malfunctions.

Points to Remember While Plotting a Scatter Diagram

  • A correlation on the chart does not guarantee that one variable causes the other. The pattern can be a coincidence or can be caused by a third variable.
  • A scatter diagram works best when you have a large amount of data.
  • The more the data resemble a straight line, the stronger the correlation.
  • The data should cover a wide range of values before you plot a scatter chart.
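The last point, about wide data coverage, can be illustrated with a small sketch. It uses synthetic data, invented for this example, in which y follows x with a fixed alternating noise of +3 and -3. The helper name `pearson_r` is hypothetical, and the comparison assumes that the Pearson correlation coefficient is an acceptable measure of how straight the pattern is.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient (assumes neither variable is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Synthetic data (illustrative only): y rises with x, plus alternating noise.
xs = list(range(20))
ys = [x + (3 if i % 2 == 0 else -3) for i, x in enumerate(xs)]

# Wide coverage: all 20 observations, x from 0 to 19.
r_wide = pearson_r(xs, ys)

# Narrow coverage: only the 4 observations with x from 8 to 11.
r_narrow = pearson_r(xs[8:12], ys[8:12])

print(f"wide range:   r = {r_wide:.2f}")
print(f"narrow range: r = {r_narrow:.2f}")
```

Over the wide range, the upward trend dominates the noise. Over the narrow slice, the same noise swamps the trend, so the chart would look as if the variables were unrelated.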

Summary

Scatter diagrams are useful in determining the relationship between two variables. This relationship can be between two causes or a cause and an effect. The correlation can be positive, negative, or absent.

Usually, the first variable is independent, and the second variable is dependent on the first. To analyze the pattern of the relationship, you change the independent variable and monitor the changes in the dependent one. A scatter diagram can also have two independent variables.

A scatter diagram is an important concept from a PMP exam point of view. Please understand it well.

Fahad Usmani, PMP

I am Mohammad Fahad Usmani, B.E. PMP, PMI-RMP. I have been blogging on project management topics since 2011. To date, thousands of professionals have passed the PMP exam using my resources.