Data Analysis
A/B Test results can be viewed in the Data Analysis tab on the A/B Test detail page.
Once the test is correctly implemented, traffic is allocated, and users are exposed to the A/B Test, the first results appear within 1 hour. Results are then refreshed approximately every hour.
Events from users exposed to each test group (Group A, Group B, etc.) are collected, the pre-registered goals are calculated, and statistical analysis results are provided together.
Items Provided in A/B Test Results
A/B Test results are provided based on the goals registered before the test starts. The items available on the A/B Test results page are as follows:
Common
Items that can be viewed regardless of the registered goal.
Total Exposed Users: The total number of users exposed to the A/B Test. This is updated together with goal values at most once per hour after the test starts.

A summary of the results for all goals registered in the A/B Test can be viewed in the area below "Total Exposed Users."

Hackle provides A/B Test results using two statistical analysis methods: Frequentist and Bayesian. You can select the desired statistical analysis method from among the data analysis options.

When Frequentist statistical analysis is selected, A/B Test results are displayed in the following format:

When Bayesian statistical analysis is selected, A/B Test results are displayed in the following format:

The following items can be viewed in goals. p-value and confidence interval are available when Frequentist statistical analysis is selected.
Total Exposed Users: The number of users exposed to each test group.
Improvement Rate vs. Group A: The difference in the goal value measured in each treatment (Group B, Group C, ...) compared to Group A. Represents the relative change rate; the value in parentheses indicates the absolute difference.
Baseline: The goal value of Group A.
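p-value: Indicates how statistically significant the improvement rate compared to Group A is. It is the probability of observing a difference at least this large if there were actually no difference from Group A. The p-value ranges from 0 to 1, and a value less than 0.05 is generally considered statistically significant.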
Confidence Interval: Shows the range of values that the actual "Improvement Rate vs. Group A" can take. When the test result is significant, the confidence interval contains only positive or only negative values.
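
As a rough illustration of how these items relate for a conversion-rate goal, the sketch below computes the baseline, the relative and absolute difference, a p-value, and a confidence interval from raw counts. It assumes a pooled two-proportion z-test and a 95% normal-approximation interval on the absolute difference. These are assumptions chosen for illustration, not necessarily the method Hackle uses, and the name compare_to_group_a and its parameters are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def compare_to_group_a(conv_a: int, n_a: int, conv_b: int, n_b: int) -> dict:
    """Compare a treatment group against Group A (hypothetical helper).

    conv_* are converted users, n_* are exposed users.
    """
    rate_a = conv_a / n_a          # Baseline: goal value of Group A
    rate_b = conv_b / n_b
    abs_diff = rate_b - rate_a     # absolute difference (value in parentheses)
    rel_change = abs_diff / rate_a # relative change rate

    # Assumed test: pooled two-proportion z-test, two-sided.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = abs_diff / se_pooled
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))

    # Assumed interval: 95% normal approximation on the absolute difference.
    se = math.sqrt(rate_a * (1 - rate_a) / n_a + rate_b * (1 - rate_b) / n_b)
    ci = (abs_diff - 1.96 * se, abs_diff + 1.96 * se)

    return {
        "baseline": rate_a,
        "relative_change": rel_change,
        "absolute_difference": abs_diff,
        "p_value": p_value,
        "ci_abs_diff": ci,
    }


result = compare_to_group_a(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(result)
```

With these example counts the p-value falls below 0.05 and the interval excludes 0, which matches the description of a significant result above.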

For the first two columns of items in the goals described below, the names may differ depending on the denominator event selected during goal registration and the calculation types for the denominator/numerator.
Some goal type screens can be found below.
"User Conversion Rate" Goal
Total Event Users: The number of users in each test group who triggered the event. Since this is calculated based on Unique Visitors, even if a single user triggers the same event multiple times, it is counted only once.
User Conversion Rate: Total Event Users divided by Total Exposed Users, per test group.
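
The unique-user counting described above can be sketched as follows. This is a minimal illustration, assuming the inputs are plain lists of user IDs for one test group; the function name and parameters are hypothetical and not part of any Hackle API.

```python
def user_conversion_rate(exposed_user_ids, event_user_ids):
    """Return (total_event_users, user_conversion_rate) for one test group.

    exposed_user_ids: IDs of users exposed to the group.
    event_user_ids:   one entry per triggered event (may repeat a user).
    """
    exposed = set(exposed_user_ids)
    # A user who triggers the event several times is counted only once,
    # and only events from exposed users are considered.
    event_users = set(event_user_ids) & exposed
    if not exposed:
        return 0, 0.0
    return len(event_users), len(event_users) / len(exposed)


# u1 triggered the event twice but counts as a single event user.
print(user_conversion_rate(["u1", "u2", "u3", "u4"], ["u1", "u1", "u3"]))
```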

"Average Value per User" Goal
Total Event Value: The sum of all values provided as the event property for events triggered by users exposed to each test group. For example, if the event is purchase complete and the property is purchase amount, then Total Event Value is the total purchase amount of all users.
Average Value per User: Total Event Value divided by Total Exposed Users. Using the purchase amount example, this represents the average purchase amount per user exposed to each test group.

"Average Count per User" Goal
Total Event Count: The sum of all event occurrences triggered by users exposed to each test group. For example, if the event like button click is selected, Total Event Count is the total number of times users clicked the like button.
Average Count per User: Total Event Count divided by Total Exposed Users. Using the like button example, this represents the average number of times users in each test group clicked the like button.
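
Both per-user averages divide by Total Exposed Users, not by the number of users who triggered the event. The sketch below shows the arithmetic for one test group, assuming the event values are available as a simple list with one entry per event; the function name and field names are hypothetical.

```python
def per_user_averages(total_exposed_users, event_values):
    """Compute value and count goals for one test group (illustrative only).

    event_values: one property value per triggered event,
                  for example one purchase amount per purchase.
    """
    total_event_value = sum(event_values)
    total_event_count = len(event_values)
    if total_exposed_users == 0:
        return {"total_event_value": 0, "average_value_per_user": 0.0,
                "total_event_count": 0, "average_count_per_user": 0.0}
    return {
        "total_event_value": total_event_value,
        "average_value_per_user": total_event_value / total_exposed_users,
        "total_event_count": total_event_count,
        "average_count_per_user": total_event_count / total_exposed_users,
    }


# Three purchases among four exposed users.
print(per_user_averages(4, [30.0, 20.0, 50.0]))
```

In this example the average value per user is 25.0 even though only three purchases occurred, because users who never purchased still count in the denominator.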

"Time" Goal
Total Time Elapsed: The sum of time it took users exposed to each test group to convert from the start event to the end event. For example, if Homepage Entry → Login Complete is set as the start/end event pair, Total Time Elapsed is the sum of all time users spent from Homepage Entry to Login Complete.
Average Time Elapsed: Total Time Elapsed divided by Total Conversions. Using the Homepage Entry → Login Complete example, this represents the average time it took to complete from Homepage Entry to Login Complete.
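
Unlike the other goals, the Time goal divides by Total Conversions instead of Total Exposed Users. A minimal sketch of that calculation follows, assuming each record is a (start, end) timestamp pair in seconds and that a missing end event is represented as None. How start and end events are actually paired is not covered here, and the function name is hypothetical.

```python
def time_goal(pairs):
    """Return (total_time_elapsed, total_conversions, average_time_elapsed).

    pairs: (start_seconds, end_seconds_or_None) tuples for one test group.
    """
    # Only users who reached the end event count as conversions.
    durations = [end - start for start, end in pairs if end is not None]
    total_time = sum(durations)
    conversions = len(durations)
    average = total_time / conversions if conversions else 0.0
    return total_time, conversions, average


# Two users completed Homepage Entry -> Login Complete; one did not.
print(time_goal([(0.0, 30.0), (10.0, 70.0), (5.0, None)]))
```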

Items with Statistical Significance
Descriptions of items that represent statistical significance viewable per goal.
Confidence Interval
Shows the range of values that the Improvement Rate vs. Group A can take. When the test result is significant, the confidence interval contains only positive or only negative values. Hover your mouse over the confidence interval to see the numerical values at both ends.
When the result is significant and positive, the confidence interval lies entirely to the right of 0. If the success criterion is set to "Decrease compared to Group A," the opposite applies, and a favorable result lies entirely to the left of 0. When the result is significant and favorable, a green Significant badge appears to the right of the test group name. This means the test group is performing significantly better than Group A.

When the result is significant and negative, the confidence interval lies entirely to the left of 0. In the example below, however, the success criterion was set to "Decrease compared to Group A" during goal registration, so the unfavorable result is the significant, positive one, with the confidence interval to the right of 0. When the result is significant and unfavorable, a red Significant badge appears to the right of the test group name. This means the test group is performing significantly worse than Group A.

When a conclusion cannot be drawn from the results, 0 is included in the confidence interval. Since no group has achieved significant results, no Significant badge appears for any test group.
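
The three cases above can be summarized as a small decision rule. The sketch below is an interpretation of the text, not Hackle's implementation: the function name, the "increase"/"decrease" labels, and the choice to treat an interval that touches 0 as inconclusive are all assumptions.

```python
def significance_badge(ci_low, ci_high, success_criterion="increase"):
    """Return "green", "red", or None for a confidence interval.

    ci_low, ci_high:   ends of the interval for Improvement Rate vs. Group A.
    success_criterion: "increase" or "decrease" compared to Group A.
    """
    if success_criterion not in ("increase", "decrease"):
        raise ValueError("success_criterion must be 'increase' or 'decrease'")

    # 0 inside the interval: no conclusion, so no badge.
    if ci_low <= 0 <= ci_high:
        return None

    interval_is_positive = ci_low > 0
    if success_criterion == "increase":
        favorable = interval_is_positive
    else:
        # With a "decrease" criterion, an interval left of 0 is the good outcome.
        favorable = not interval_is_positive
    return "green" if favorable else "red"


print(significance_badge(0.02, 0.10))               # right of 0, increase wanted
print(significance_badge(0.02, 0.10, "decrease"))   # right of 0, decrease wanted
print(significance_badge(-0.03, 0.05))              # contains 0
```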

Trend Graph
Click Show Graph below each goal to see a graph showing trends by date. (When the graph is visible, the text changes to Hide Graph.)
You can choose from 4 items: the selected goal type, p-value (Frequentist), probability of being the best (Bayesian), and probability of outperforming Group A (Bayesian). The image below is for the case where user conversion rate was selected as the goal.

p-value: A value that expresses the reliability of the A/B Test results. A lower p-value indicates stronger evidence of a real difference from Group A. In particular, a value less than 0.05 can be considered statistically significant, in which case you will see the Significant badge.
Probability of being the best: The probability that each test group is the best. This value is calculated based on Bayesian Statistics.
Probability of outperforming Group A: The probability that the treatment is better than the control (Group A). This value is also calculated based on Bayesian Statistics.
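
To give a feel for what the two Bayesian values mean, the sketch below estimates them for a conversion-rate goal by sampling. It assumes a uniform Beta(1, 1) prior on each group's conversion rate, Monte Carlo sampling from the resulting Beta posteriors, and that a higher rate is better. This is a common textbook approach and is not confirmed to be the model Hackle uses; the function name and parameters are hypothetical.

```python
import random


def bayesian_probabilities(groups, draws=10000, seed=0):
    """Estimate Bayesian comparison values by posterior sampling.

    groups: dict mapping group name to (converted_users, exposed_users).
            The first key is treated as the control (Group A).
    Returns (prob_best, prob_outperforms_a) as two dicts.
    """
    rng = random.Random(seed)
    names = list(groups)
    control = names[0]
    best_counts = {name: 0 for name in names}
    beats_a_counts = {name: 0 for name in names[1:]}

    for _ in range(draws):
        # Draw one plausible conversion rate per group from its posterior.
        sample = {
            name: rng.betavariate(1 + conv, 1 + exposed - conv)
            for name, (conv, exposed) in groups.items()
        }
        best = max(names, key=lambda name: sample[name])
        best_counts[best] += 1
        for name in names[1:]:
            if sample[name] > sample[control]:
                beats_a_counts[name] += 1

    prob_best = {name: count / draws for name, count in best_counts.items()}
    prob_beats_a = {name: count / draws for name, count in beats_a_counts.items()}
    return prob_best, prob_beats_a


prob_best, prob_beats_a = bayesian_probabilities(
    {"A": (100, 1000), "B": (150, 1000), "C": (110, 1000)}
)
print(prob_best)
print(prob_beats_a)
```

The probabilities of being the best sum to 1 across all groups, while each probability of outperforming Group A is a separate pairwise comparison against the control.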

If you want to know the value at a specific point in the graph, hover your mouse over that point as shown in the image above.
FAQ
Does data from Test Devices also reflect in experiment results?
In general, data from registered Test Devices is not reflected in experiment results. However, if the device was already exposed to the experiment before being registered as a Test Device, the data will be reflected.