Running an A/B Experiment¶
An Experiment lets you split your users into groups, show each group a different version of a feature, and measure which version wins. This guide walks you through setting up and running your first A/B test.
Before You Begin¶
Make sure you have:
- At least one Feature Flag created — the flag you want to test (guide here)
- At least one Metric created — what you want to measure (guide here)
Step 1: Define Your Hypothesis¶
Before touching FlagPal, write down your hypothesis. This keeps you honest and makes interpreting results easier.
Format: "We believe that [change] will result in [outcome] because [reason]."
Example: "We believe that changing the checkout button to green will increase the purchase conversion rate because green is associated with positive/go actions."
Step 2: Navigate to Experiments¶
Click Experiments in the left sidebar, then click New Experiment.
Step 3: Fill in the Basic Details¶
Name (required)¶
Be specific — include what you're testing and when.
Examples:
- "Checkout Button Color Test — Q1 2024"
- "Homepage Hero Text A/B Test"
- "Pricing Page Layout Experiment"
Description (optional but recommended)¶
Write your hypothesis here, along with any relevant context.
Step 4: Set the Traffic Percentage¶
The Traffic Percentage controls what portion of your total users enter the experiment.
- 100% → All users are eligible for the experiment
- 50% → Half your users are in the experiment; the other half see the default
- 10% → Only 10% of users are in the experiment (for cautious rollouts)
Tip: Start with a smaller percentage (10–25%) if you're testing a significant change. Once you're confident there are no issues, increase it.
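Traffic bucketing like this is typically done with a deterministic hash of the user ID, so the same user always gets the same answer. Here is a minimal sketch of the idea — the function name and hashing scheme are illustrative assumptions, not FlagPal's actual implementation:

```python
import hashlib

def in_experiment(user_id: str, experiment_key: str, traffic_pct: float) -> bool:
    """Deterministically decide whether a user enters the experiment.

    Illustrative sketch: hash the user + experiment key to a stable value
    in [0, 100) and compare it against the traffic percentage.
    """
    digest = hashlib.sha256(f"{experiment_key}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 2**32 * 100  # stable value in [0, 100)
    return bucket < traffic_pct
```

Because the hash is keyed on the experiment, raising the percentage later only adds new users; everyone already enrolled stays enrolled.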
Step 5: Configure Your Variants¶
Variants are the different versions you're testing. An experiment needs at least two variants.
Adding a Variant¶
- Click Add Variant
- Give the variant a name (e.g., "Control", "Variant A", "Blue Button")
- Set the Feature Flag values for this variant
- (optional) Set the weight for how much traffic this variant receives
Example: Two-Variant Experiment¶
For a button color test with feature flag checkout_button_style:
| Variant Name | Flag Value | Weight |
|---|---|---|
| Control (Blue) | "blue" | 1 |
| Variant B (Green) | "green" | 1 |
Equal weights = 50/50 traffic split.
Unequal Weights¶
You can give variants different weights for an unequal split:
| Variant | Weight | Traffic % |
|---|---|---|
| Control | 9 | 90% |
| New Variant | 1 | 10% |
This is useful for cautious rollouts — test the new variant on a small percentage first.
Three or More Variants¶
FlagPal supports experiments with more than two variants:
| Variant | Flag Value | Weight |
|---|---|---|
| Control | "blue" | 1 |
| Green | "green" | 1 |
| Red | "red" | 1 |
This gives each variant ~33% of experiment traffic.
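Weight-based splitting can be pictured as slicing a hash range in proportion to the weights. A rough sketch of the technique — the function and hashing details are assumptions for illustration, not FlagPal's internals:

```python
import hashlib

def assign_variant(user_id: str, experiment_key: str,
                   variants: list[tuple[str, int]]) -> str:
    """Deterministically pick a variant in proportion to its weight.

    variants is a list of (name, weight) pairs,
    e.g. [("Control", 9), ("New Variant", 1)] for a 90/10 split.
    """
    total = sum(weight for _, weight in variants)
    digest = hashlib.sha256(f"{experiment_key}:{user_id}".encode()).hexdigest()
    point = int(digest[:8], 16) % total  # stable value in [0, total)
    for name, weight in variants:
        if point < weight:
            return name
        point -= weight
    return variants[-1][0]  # defensive fallback; the loop always returns
```

With weights 9 and 1, roughly 90% of hashed users land on Control, and each user's assignment never changes between requests.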
Step 6: Add Targeting Rules (Optional but Recommended)¶
Like Experiences, you can target your experiment at specific users. For example:
- Only run the experiment on users in the US
- Only include users on the premium plan
- Only include users who have been active in the last 30 days
Targeting rules are useful when:
- You want to protect certain users from changes (e.g., those who already have different flows, or VIP customers)
- You only care about results for a specific segment
- You're running multiple experiments and want to avoid overlap
It's good practice to add a rule that checks the flags you're setting in your variants are not already set. This avoids suddenly changing features for users who have already entered a different flow, and prevents enrolling the same users in your experiment more than once:
- Rule 1: checkout_button_style equals (empty)
See Best Practices for more details.
Step 7: Attach Metrics¶
This is where you define what you're measuring. Click Add Metric and select the metric(s) you want to track for this experiment.
Tip: Pick one primary metric that determines the winner (like purchase_completed) and one impression metric that tracks enrollment (like experiment_started).
Add secondary metrics for additional insight.
Example for a checkout button test:
- Primary/Goal: purchase_completed (Boolean) — did the user buy?
- Impression: experiment_started (Integer) — the user started the experiment
- Secondary: purchase_revenue (Money) — how much did they spend?
Step 8: Activate the Experiment¶
Once everything is configured, activate the experiment:
- Review all settings
- Click Activate or toggle the Active status to on
Double-check before activating: once an experiment is running, changing the variants or weights can skew your results. Make sure everything is correct before you start. If your setup is wrong, it's better to stop the experiment and start a new one, unless you're an advanced user.
Step 9: Wait for Data¶
This is the hardest part — patience! Let the experiment run until you have enough data for meaningful results.
Guidelines:
- Run for at least 1-2 weeks to capture different days of the week
- Make sure each variant has at least 100 users (more is better)
- Don't stop the experiment just because one variant looks like it's winning after a day or two — early results are often misleading
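For rough planning, the common rule of thumb n ≈ 16·p(1−p)/Δ² (two-sided α = 0.05, 80% power) estimates how many users each variant needs to detect a given lift. A sketch — the function name is illustrative:

```python
import math

def users_per_variant(baseline_rate: float, relative_lift: float) -> int:
    """Rule-of-thumb sample size per variant: n ~= 16 * p * (1 - p) / delta^2,
    for a two-sided alpha of 0.05 and 80% power."""
    delta = baseline_rate * relative_lift  # absolute difference to detect
    return math.ceil(16 * baseline_rate * (1 - baseline_rate) / delta ** 2)

# Detecting a 20% relative lift on a 4% baseline (4.0% -> 4.8%)
# needs roughly 9,600 users per variant.
```

Note how quickly this grows: subtle changes need far more than the 100-user minimum above.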
See Best Practices for more details.
Step 10: View Results¶
While the experiment is running (and after it ends), click on the experiment to see results.
The Results View Shows:¶
- Per-variant metrics — how each variant performed on each metric
- Charts — visual trends over time
- Statistical context — is the difference real or could it be chance?
Interpreting Results¶
See Reading Experiment Results for more details.
| Scenario | What It Means |
|---|---|
| Variant B has 5.8% conversion vs. Control's 4.2% | Variant B is performing better |
| Both variants have ~4% conversion | No clear winner — continue the experiment or accept null result |
| Variant B has 2% conversion vs. Control's 4.2% | Variant B is hurting performance — stop it |
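The "statistical context" question above boils down to a significance test. For conversion-style metrics, a two-proportion z-test is the classic check — this is a generic statistics sketch, not FlagPal's internal method:

```python
import math

def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates.

    conv_a / n_a: conversions and users in Control;
    conv_b / n_b: conversions and users in the variant.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
```

A p-value below 0.05 is the conventional bar for calling a difference real rather than chance; a 5.8% vs. 4.2% gap only clears it once each variant has enough users.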
Step 11: Declare a Winner¶
Once you have statistically significant results:
- Stop the experiment by deactivating it
- Roll out the winner using an Experience — create an Experience that sets the winning flag value for all users
- Document the result — write down what you learned (win, loss, or inconclusive)
Replicating an Experiment¶
If you want to run a follow-up experiment (e.g., testing the winning variant against a new idea), use the Replicate feature. This creates a copy of the experiment that you can modify and run again.
To replicate:
- Go to the experiment you want to copy
- Find and click the Replicate option
- Modify the copy as needed and activate it
Common Mistakes to Avoid¶
❌ Stopping too early — results change as more data comes in. Be patient.
❌ Testing too many things at once — keep experiments focused on one change.
❌ Ignoring negative results — a losing variant is valuable data. Document it.
❌ No hypothesis — define what you're trying to prove before you run the test.
❌ Wrong metrics — measure something that's directly connected to the feature you're testing.
Related Guides¶
- Creating a Feature Flag →
- Tracking Metrics →
- Creating an Experience → — for rolling out the winner