
Running an A/B Experiment

An Experiment lets you split your users into groups, show each group a different version of a feature, and measure which version wins. This guide walks you through setting up and running your first A/B test.


Before You Begin

Make sure you have:

  • At least one Feature Flag created — the flag you want to test (guide here)
  • At least one Metric created — what you want to measure (guide here)

Step 1: Define Your Hypothesis

Before touching FlagPal, write down your hypothesis. This keeps you honest and makes interpreting results easier.

Format: "We believe that [change] will result in [outcome] because [reason]."

Example: "We believe that changing the checkout button to green will increase the purchase conversion rate because green is associated with positive/go actions."


Step 2: Navigate to Experiments

Click Experiments in the left sidebar, then click New Experiment.


Step 3: Fill in the Basic Details

Name (required)

Be specific — include what you're testing and when.

Examples:

  • "Checkout Button Color Test — Q1 2024"
  • "Homepage Hero Text A/B Test"
  • "Pricing Page Layout Experiment"

Description

Write your hypothesis here, along with any relevant context.


Step 4: Set the Traffic Percentage

The Traffic Percentage controls what portion of your total users enter the experiment.

  • 100% → All users are eligible for the experiment
  • 50% → Half your users are in the experiment; the other half see the default
  • 10% → Only 10% of users are in the experiment (for cautious rollouts)

Tip: Start with a smaller percentage (10–25%) if you're testing a significant change. Once you're confident there are no issues, increase it.
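FlagPal's internal eligibility logic isn't documented in this guide, but a common way tools like this decide who enters an experiment is deterministic hash-based bucketing: the same user always lands in the same bucket, so eligibility is stable across sessions. A rough Python sketch (the function name and hashing scheme are assumptions for illustration, not FlagPal's implementation):

```python
import hashlib

def in_experiment(user_id: str, experiment_key: str, traffic_pct: float) -> bool:
    """Deterministically decide whether a user enters the experiment.

    Hashing user_id together with the experiment key gives each user a
    stable position in [0, 100), so the same user always gets the same
    answer for the same experiment.
    """
    digest = hashlib.sha256(f"{experiment_key}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 10000 / 100  # stable value in 0.00 .. 99.99
    return bucket < traffic_pct
```

With a traffic percentage of 100 every user is eligible; with 10, roughly one user in ten enters the experiment and the rest see the default.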


Step 5: Configure Your Variants

Variants are the different versions you're testing. An experiment needs at least two variants.

Adding a Variant

  1. Click Add Variant
  2. Give the variant a name (e.g., "Control", "Variant A", "Blue Button")
  3. Set the Feature Flag values for this variant
  4. (optional) Set the weight for how much traffic this variant receives

Example: Two-Variant Experiment

For a button color test with feature flag checkout_button_style:

Variant Name      | Flag Value | Weight
Control (Blue)    | "blue"     | 1
Variant B (Green) | "green"    | 1

Equal weights = 50/50 traffic split.

Unequal Weights

You can give variants different weights for an unequal split:

Variant     | Weight | Traffic %
Control     | 9      | 90%
New Variant | 1      | 10%

This is useful for cautious rollouts — test the new variant on a small percentage first.

Three or More Variants

FlagPal supports experiments with more than two variants:

Variant | Flag Value | Weight
Control | "blue"     | 1
Green   | "green"    | 1
Red     | "red"      | 1

This gives each variant ~33% of experiment traffic.
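In general, each variant's share of experiment traffic is its weight divided by the sum of all weights. A minimal Python sketch of weighted, deterministic variant assignment (the function and hashing scheme are illustrative, not FlagPal's actual algorithm):

```python
import hashlib

def assign_variant(user_id: str, experiment_key: str, variants: dict) -> str:
    """Map a user to a variant in proportion to the variant weights.

    Each variant receives weight / sum(weights) of experiment traffic,
    e.g. {"Control": 9, "New Variant": 1} gives a 90/10 split.
    """
    total = sum(variants.values())
    digest = hashlib.sha256(f"{experiment_key}:{user_id}".encode()).hexdigest()
    point = int(digest, 16) % total  # stable position in [0, total)
    for name, weight in variants.items():
        if point < weight:
            return name
        point -= weight
    raise AssertionError("unreachable: point is always < total")
```

Because the assignment is hash-based, a user keeps the same variant on every visit, which is essential for a valid experiment.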


Step 6: Add Targeting Rules (Optional)

Like Experiences, you can target your experiment at specific users. For example:

  • Only run the experiment on users in the US
  • Only include users on the premium plan
  • Only include users who have been active in the last 30 days

Targeting rules are useful when:

  • You want to protect certain users from changes (e.g., those who already have different flows or VIP customers)
  • You only care about results for a specific segment
  • You're running multiple experiments and want to avoid overlap

It's good practice to add rules that check for the non-existence of the flags you set in your variants. This prevents you from suddenly changing features for users who have already entered a different flow, and from enrolling the same users in your experiment more than once:

  • Rule 1: checkout_button_style equals (empty)

See Best Practices for more details.
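For intuition, here is a tiny sketch of how AND-ed targeting rules like the ones above might be evaluated against a user. The rule tuples and function are hypothetical and not FlagPal's actual rule schema:

```python
def passes_targeting(user_flags: dict, user_attrs: dict, rules: list) -> bool:
    """Evaluate simple AND-ed targeting rules against a user.

    Illustrative rule shapes (not FlagPal's real schema):
      ("flag_empty", key)        -> the user must NOT already have this flag set
      ("attr_equals", key, val)  -> the user attribute must equal val
    """
    for rule in rules:
        if rule[0] == "flag_empty":
            if rule[1] in user_flags:
                return False  # user already has the flag: exclude them
        elif rule[0] == "attr_equals":
            if user_attrs.get(rule[1]) != rule[2]:
                return False  # attribute mismatch: exclude them
    return True  # all rules passed

# Hypothetical rules mirroring the guidance above: flag must be empty, user in the US.
rules = [("flag_empty", "checkout_button_style"), ("attr_equals", "country", "US")]
```

A user who already has checkout_button_style set (say, from an earlier rollout) fails the first rule and stays out of the experiment.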

Step 7: Attach Metrics

This is where you define what you're measuring. Click Add Metric and select the metric(s) you want to track for this experiment.

Tip: Pick one primary metric that determines the winner (like purchase_completed) and one impression metric that tracks enrollment (like experiment_started). Add secondary metrics for additional insight.

Example for a checkout button test:

  • Primary/Goal: purchase_completed (Boolean) — did the user buy?
  • Impression: experiment_started (Integer) — the user started the experiment
  • Secondary: purchase_revenue (Money) — how much did they spend?

Step 8: Activate the Experiment

Once everything is configured, activate the experiment:

  1. Review all settings
  2. Click Activate or toggle the Active status to on

Double-check before activating. Once an experiment is running, changing the variants or weights can skew your results, so make sure everything is correct before you start. If you discover a setup mistake after activating, it's usually better to stop the experiment and start a new one than to edit it mid-run, unless you're an advanced user.


Step 9: Wait for Data

This is the hardest part — patience! Let the experiment run until you have enough data for meaningful results.

Guidelines:

  • Run for at least 1-2 weeks to capture different days of the week
  • Make sure each variant has at least 100 users (more is better)
  • Don't stop the experiment just because one variant looks like it's winning after a day or two — early results are often misleading

See Best Practices for more details.
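How many users is "enough"? A common rule of thumb (Lehr's approximation, roughly 80% power at a 5% significance level for a two-sided test) gives a ballpark per-variant sample size. This is a generic statistical sketch, not a FlagPal feature:

```python
import math

def sample_size_per_variant(baseline: float, lift: float) -> int:
    """Rough sample size per variant via Lehr's rule (n ~ 16 * s^2 / d^2).

    baseline: current conversion rate, e.g. 0.042 (4.2%)
    lift: absolute improvement you want to be able to detect, e.g. 0.01 (+1 point)
    """
    p_bar = baseline + lift / 2  # average rate across the two variants
    return math.ceil(16 * p_bar * (1 - p_bar) / lift ** 2)
```

Detecting a 1-point lift over a 4.2% baseline needs several thousand users per variant, which is why a day or two of data is rarely enough.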

Step 10: View Results

While the experiment is running (and after it ends), click on the experiment to see results.

The Results View Shows:

  • Per-variant metrics — how each variant performed on each metric
  • Charts — visual trends over time
  • Statistical context — is the difference real or could it be chance?

Interpreting Results

See Reading Experiment Results for more details.

Scenario                                         | What It Means
Variant B has 5.8% conversion vs. Control's 4.2% | Variant B is performing better
Both variants have ~4% conversion                | No clear winner — continue the experiment or accept a null result
Variant B has 2% conversion vs. Control's 4.2%   | Variant B is hurting performance — stop it
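The "real difference or chance?" question is typically answered with a significance test. A self-contained sketch of a two-sided, two-proportion z-test; this is the generic calculation, not necessarily what FlagPal computes:

```python
from statistics import NormalDist

def conversion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided, two-proportion z-test.

    Returns the probability of observing a conversion-rate difference at
    least this large if both variants truly convert at the same rate.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)           # rate under "no difference"
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))
```

Note how sample size matters: 4.2% vs. 5.8% conversion is not significant at 1,000 users per variant, but becomes highly significant at 10,000, which is why early "winners" are often mirages.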

Step 11: Declare a Winner

Once you have statistically significant results:

  1. Stop the experiment by deactivating it
  2. Roll out the winner using an Experience — create an Experience that sets the winning flag value for all users
  3. Document the result — write down what you learned (win, loss, or inconclusive)

Replicating an Experiment

If you want to run a follow-up experiment (e.g., testing the winning variant against a new idea), use the Replicate feature. This creates a copy of the experiment that you can modify and run again.

To replicate:

  1. Go to the experiment you want to copy
  2. Find and click the Replicate option
  3. Modify the copy as needed and activate it

Common Mistakes to Avoid

Stopping too early — results change as more data comes in. Be patient.

Testing too many things at once — keep experiments focused on one change.

Ignoring negative results — a losing variant is valuable data. Document it.

No hypothesis — define what you're trying to prove before you run the test.

Wrong metrics — measure something that's directly connected to the feature you're testing.