L&D leaders spend a substantial amount of their time evaluating whether training works; a large part of business strategy depends on the quality of employees’ work, and training is supposed to lead to improved organizational outcomes. To assess whether training is leading to the desired outcomes, there are two scientific approaches: experimental/quasi-experimental designs and comparison to standards.

An experimental design randomly assigns some employees to attend training and others, the control group, not to attend. With the exception of the training, the groups are treated equally, and the training team collects identical measures before, during and after the event. To analyze the data, the team compares the results from the training group to those of the control group. Because random assignment makes the groups comparable, the difference in results is most likely due to training. Quasi-experimental designs follow the same principles, but participants are not randomly assigned to training conditions. Because the groups may differ in ways other than training, the results have to be interpreted with caution.
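As a rough illustration of the comparison described above, the following Python sketch estimates a training effect as the difference between the average gain of the training group and the average gain of the control group. The function names and all scores are hypothetical, and the sketch only compares means; it does not test whether the difference is statistically significant.

```python
from statistics import mean


def mean_gain(pre_scores, post_scores):
    """Average post-minus-pre change for one group (scores paired by employee)."""
    return mean(post - pre for pre, post in zip(pre_scores, post_scores))


def estimated_training_effect(trained_pre, trained_post, control_pre, control_post):
    """Gain of the training group minus gain of the control group."""
    return mean_gain(trained_pre, trained_post) - mean_gain(control_pre, control_post)


# Hypothetical assessment scores, invented for illustration only.
trained_pre = [60, 65, 70, 55]
trained_post = [75, 80, 82, 71]
control_pre = [62, 64, 69, 57]
control_post = [65, 66, 72, 61]

effect = estimated_training_effect(trained_pre, trained_post, control_pre, control_post)
print(f"Estimated training effect: {effect:+.1f} points")
```

Subtracting the control group's gain removes changes that would have happened without training, such as general experience gained on the job over the same period.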

While experimental and quasi-experimental approaches are the most rigorous methodologies, they are also the most time-consuming and resource-intensive. As a result, they are practical for only a few course evaluations each year.

The comparison-to-standards approach, on the other hand, collects standard measures from learners and compares them to a benchmark, or best practice measure, created by aggregating historical performance metrics. The best comparisons occur when the data is collected using the same instruments, such as surveys; focus groups; interviews; or system-based operational measures, such as sales, transactions and financial information.
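A minimal sketch of this approach, assuming the benchmark is a simple mean of historical scores for each metric (other aggregations, such as a median, are equally plausible). The metric names and values are hypothetical.

```python
from statistics import mean


def build_benchmark(historical_scores):
    """Aggregate historical scores per metric into one benchmark value (simple mean)."""
    return {metric: mean(values) for metric, values in historical_scores.items()}


def gaps_to_benchmark(course_scores, benchmark):
    """Course score minus benchmark, for every metric present in both."""
    return {
        metric: score - benchmark[metric]
        for metric, score in course_scores.items()
        if metric in benchmark
    }


# Hypothetical historical results collected with the same survey instrument.
historical = {
    "instructor_rating": [80, 84, 82],
    "job_applicability": [70, 74, 72],
}
benchmark = build_benchmark(historical)

# Hypothetical scores for the course being evaluated.
course = {"instructor_rating": 85, "job_applicability": 68}
gaps = gaps_to_benchmark(course, benchmark)

for metric, gap in gaps.items():
    print(f"{metric}: {gap:+} versus benchmark {benchmark[metric]}")
```

The comparison is only meaningful if the course scores and the historical scores come from the same instrument, which is why the sketch keys both on the same metric names.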

L&D professionals can also benefit from experimental impact studies: It is possible to combine the comparison-to-standards approach with experimental design for a small set of training programs that are strategic, visible and costly.

Internal versus External Benchmarks

There are two common types of benchmarks: internal and external. Internal benchmarks allow L&D leaders to compare course effectiveness across the organization’s curriculum and over consecutive periods. They can use these benchmarks as a performance bar to determine if various aspects of courses are meeting, exceeding or falling short of standards.
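One way to turn a benchmark into a performance bar is to classify each score by its gap to the benchmark, as in the sketch below. The tolerance band is an assumption added for illustration, so that trivially small gaps still count as "meeting"; the article does not prescribe a threshold.

```python
def classify_against_benchmark(score, benchmark, tolerance=1.0):
    """Label a score as exceeding, meeting or falling short of a benchmark.

    `tolerance` is a hypothetical band (in score points) within which
    a score is treated as meeting the standard.
    """
    gap = score - benchmark
    if gap > tolerance:
        return "exceeding"
    if gap < -tolerance:
        return "falling short"
    return "meeting"


# Hypothetical scores compared against a hypothetical benchmark of 82.
for score in (85, 82.5, 78):
    print(score, classify_against_benchmark(score, 82))
```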

External benchmarks allow L&D professionals to compare their training effectiveness data to broader industry standards. Most organizations strive to be leaders in their industry. This healthy rivalry often drives improvement, which helps organizations outperform the competition. It is, therefore, important to identify leading companies in your industry that can provide you with industry benchmarks.

Special Considerations for Benchmark Comparison

1. Comparability

Benchmarks must allow you to compare apples to apples. For example, a company in the pharmaceuticals industry should compare itself to other companies in the pharmaceuticals industry. Course topics and learning methodologies, such as instructor-led courses, self-paced e-learning or virtual instructor-led courses, also provide excellent points of comparison.

2. Alignment with a Learning Model

It is a good idea to collect benchmarks associated with a specific learning model and compare course scores to benchmark learning values. A few popular models include Metrics that Matter™’s Predictive Learning Impact Model™, the Kirkpatrick model and Jack Phillips’ ROI Methodology™. If your organization uses one of these models, it is important to ensure that all the data you collect is aligned to that model.

3. Dashboard and Reporting Automation

With an automated system, L&D managers can distribute surveys at the end of a course and preview the results within minutes. Some systems build analytics into the tool, and it takes little effort to create descriptive reports. Many already offer customizable dashboards, enabling training leaders to compare results to pre-defined benchmarks for each measure.

Making Sure Your Benchmark Is Valid

Benchmarks must rely on valid tools, which is why it’s important to use validated surveys that align with learning. There are several other important parameters when assessing the validity of a benchmark:

Historical Depth

The longer the period covered by the benchmark data, the more valid the comparison. Ask whether the benchmark data is based on, for example, three months or three years of results.

Benchmark Segmentation

The more flexibility a benchmark offers to compare measures from a specific segment, the more valid the comparison will be. Common benchmark segmentations are based on industry (e.g., defined by Dow Jones), course type (e.g., defined by the Association for Talent Development), and, more broadly, job type and international geographies.

Large Base of Organizations

Valid benchmarks are based on measures taken from a large number of companies. If the benchmark is limited to a few organizations, once you slice and dice the measures into segments, you compromise the integrity of the data.
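The risk of slicing a small benchmark too finely can be made concrete with a sketch that reports a segment benchmark only when enough organizations contribute to it. The minimum of three organizations, the record layout and the data are all hypothetical; an appropriate threshold depends on the benchmark provider.

```python
from collections import defaultdict
from statistics import mean


def segment_benchmarks(records, min_orgs=3):
    """Compute a benchmark per segment from (segment, organization, score) records.

    Each organization is averaged first so that one large contributor does not
    dominate. Segments with fewer than `min_orgs` organizations return None.
    """
    scores_by_segment = defaultdict(lambda: defaultdict(list))
    for segment, organization, score in records:
        scores_by_segment[segment][organization].append(score)

    benchmarks = {}
    for segment, orgs in scores_by_segment.items():
        if len(orgs) < min_orgs:
            benchmarks[segment] = None  # too few organizations to trust
        else:
            benchmarks[segment] = mean(mean(scores) for scores in orgs.values())
    return benchmarks


# Hypothetical records, invented for illustration only.
records = [
    ("hospitality", "Org A", 80),
    ("hospitality", "Org B", 84),
    ("hospitality", "Org C", 82),
    ("pharmaceuticals", "Org D", 90),
    ("pharmaceuticals", "Org E", 70),
]
result = segment_benchmarks(records)
print(result)
```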

For example, a director at an international hotel chain used the benchmark approach to help her decide against revamping a course. Several of her staff had tried to make the case that the course needed an overhaul. By looking at the course results for the year and comparing them to benchmarks, she learned that the course was performing above benchmarks in all areas.

The director decided to make no changes to this course. Instead, she invested her valuable resources in improving courses that underperformed benchmarks. To make this important decision, the director used a summary table similar to this one (based on recent research into a learning impact model):

[Example chart: summary table listing each metric with its course score and benchmark score]
Values are included here for example purposes only; benchmark values are not actual benchmark scores.

This table shows how easy it is to determine if a course is performing well. You can modify the first column of the table to reflect the metrics that are important to your L&D organization. The scores in the second and third columns reflect the percentage of learners who selected the top two boxes on the rating scale (“agree” or “strongly agree”) when responding to questions on the course evaluation. Instead of percentages, you could use the average Likert scale score (e.g., 4.50). In this way, you can assess quality across the curriculum and make fact-based decisions about how and when to improve training, just like the hotel director.
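The two scoring options mentioned above can be sketched as follows, assuming a five-point Likert scale on which 4 means "agree" and 5 means "strongly agree". The responses and the benchmark value are invented for illustration and are not actual benchmark scores.

```python
def top_two_box_percent(responses, scale_max=5):
    """Percentage of responses in the top two boxes of the rating scale."""
    top_two = sum(1 for rating in responses if rating >= scale_max - 1)
    return 100 * top_two / len(responses)


def average_likert(responses):
    """Mean rating on the Likert scale."""
    return sum(responses) / len(responses)


# Hypothetical responses to one course evaluation question.
responses = [5, 4, 4, 3, 5, 2, 4, 5, 4, 4]
hypothetical_benchmark = 75.0  # illustrative top-two-box benchmark, in percent

course_percent = top_two_box_percent(responses)
course_average = average_likert(responses)
gap = course_percent - hypothetical_benchmark

print(f"Top-two-box: {course_percent:.0f}% (benchmark {hypothetical_benchmark:.0f}%, gap {gap:+.0f})")
print(f"Average Likert score: {course_average:.2f}")
```

Whichever scoring method is chosen, the course score and the benchmark need to use the same one, since a top-two-box percentage and an average rating are not directly comparable.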

The comparison-to-standards approach is the most practical, scalable and cost-effective method to collect learning effectiveness data. Benchmarks are an essential part of the comparison-to-standards approach, because they establish a performance standard, which helps L&D professionals assess the quality of their training programs.