A/B Testing PM Interview: How to Ace the Facebook Ads Revenue Question

Welcome to the fifth edition of PM Interview Prep Weekly! I’m Ajitesh, and this week we’re diving into one of the most technical yet business-critical areas of product management: A/B testing for revenue optimization.

The Context

A/B testing questions are particularly common when recruiting for consumer-tech PM roles at firms like Facebook, Google, and other data-driven organizations.

Having worked as a PM in both consumer tech and enterprise spaces, I’ve seen the stark differences in how A/B testing manifests. At Fitso and Sqrrl, the consumer-tech startups I was part of, A/B testing was pretty routine: almost every new feature launch, every marketing campaign. In consumer tech, data is plentiful and user verdicts come fast.

When working at Cisco and Google Cloud, enterprise territory, I found A/B testing rare. Why? You have fewer users, long development cycles, and a handful of large customers who can determine the fate of the product. Our backend APIs would go through alpha, beta, and GA (Generally Available) releases with one-year deprecation periods. There was no room for experimentation; enterprise customers demand stability. Instead, a lot of work goes in at the start to get the product and design decisions right.

But here’s what fascinated me: even within Google Cloud, our Pantheon UI team (code name for Google Cloud frontend) ran experiments constantly. While the backend stayed stable, the frontend tested everything—how to present features, where to place buttons, which workflows converted better.

Google’s culture of experimentation runs deep. There’s a famous story (possibly false, but illustrative) about a PM who changed the font color in Google Search and increased revenue by $1 billion, got promoted, then a year later someone changed it back and also got promoted. Was it novelty effect? Primacy bias? Who knows—but it shows how even tiny changes can have massive impacts when you’re operating at scale.

This dual experience taught me something crucial: while consumer-oriented companies obsess over A/B testing and metrics (rightfully so), B2B companies often undervalue these skills. But the ability to think experimentally, interpret metrics, and understand user behavior is valuable everywhere. Even in my enterprise role, understanding A/B testing principles helped me make sense of our UI team’s decisions and spot patterns in user adoption data.

The setup: Ads represent one of the trickiest three-sided marketplaces: users want relevant experiences, advertisers want ROI, and the platform wants revenue.

Facebook generates over $100 billion annually from ads—that’s roughly 97% of their revenue. When a PM at Facebook designs an A/B test, they’re not just optimizing a feature. They’re making decisions that could impact billions in revenue while balancing user experience and advertiser success.

The question: Today we’re tackling: “What are the top 3 types of A/B experiments you would run on Facebook ads to increase revenue?”

Let’s break down exactly how to nail this question and the framework that makes A/B testing interviews manageable.

P.S. One of my PM mentors from YouTube once shared: “A good PM celebrates results; a great PM questions them.” This is especially true for A/B testing. Say you run a test that shows a 3% increase in CTR—before scaling to 100% of users, question everything about it. Was it novelty effect? Did it accidentally target a specific segment? Will it sustain over time? The same scrutiny applies to negative results. At Google’s scale, even small misinterpretations can have massive consequences.

Approach to Solving A/B Testing Questions

A/B testing questions can feel overwhelming because they combine statistical concepts with business strategy.

Here’s the thing though: A/B testing, for better or worse, has been done to death by organizations across healthcare, tech, and beyond. The approach has become remarkably standardized, almost textbook. Whether you’re designing experiments at a startup, at Google, or in an interview, the approach is almost the same.

What I’m sharing today is the approach I follow, but talk to any data scientist or PM at Google, Facebook, Amazon, or any large consumer tech company, and you’ll hear similar patterns. Pick up any book on experimentation or data science, and you’ll find the same core principles.

The Three-Step Framework:

Step 1: Define Your Hypotheses

Every A/B test starts with a clear hypothesis pair:

Null Hypothesis (H₀): In A/B testing, the null hypothesis states that there is no difference between the control and variant groups. This is your default assumption until proven otherwise.

Alternative Hypothesis (H₁): The alternative hypothesis states that there is a measurable difference between the control and variant groups. This is what you’re trying to prove.

The goal is to determine whether to reject the null hypothesis in favor of the alternative with statistical significance.
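
As a concrete sketch of that decision (my own illustration, not any specific company’s tooling), the rule reduces to comparing a p-value against a pre-chosen significance level:

```python
# Minimal sketch of the hypothesis-testing decision rule (illustrative only).
ALPHA = 0.05  # conventional significance threshold

def decide(p_value: float, alpha: float = ALPHA) -> str:
    """Reject H0 only if a result at least this extreme would occur
    less than alpha of the time when the null hypothesis is true."""
    if p_value < alpha:
        return "Reject H0: the variant shows a statistically significant difference"
    return "Fail to reject H0: no detectable difference at this sample size"

print(decide(0.03))  # Reject H0 ...
print(decide(0.40))  # Fail to reject H0 ...
```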

Step 2: Apply the PICOT Methodology

The PICOT framework, commonly used in healthcare research, adapts perfectly to structure A/B testing approaches:

  • Population: Target audience for the test (e.g., mobile app users)
  • Intervention: The change or variant being tested relative to the control—a measurable product change like altering button colors or modifying checkout processes
  • Comparison: Existing product without any change (the control group)
  • Outcome: The key metrics that define intervention impact. These become your main evaluation criteria and often represent the most critical part of the interview
  • Time: Duration for running the experiment before analyzing data and reaching conclusions

Example Application: P (mobile app users in the US aged 18-35), I (increased font size in onboarding flow), C (original font size), O (completion rate of onboarding), T (2 weeks).
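
To make the structure concrete, here’s a minimal sketch of how that example could be captured as an experiment spec; the field names are mine, not any real experimentation platform’s API:

```python
from dataclasses import dataclass

@dataclass
class PicotSpec:
    """Illustrative container for a PICOT experiment definition."""
    population: str            # who is eligible for the test
    intervention: str          # the single change being tested
    comparison: str            # the control experience
    outcome_primary: str       # the metric that decides the test
    outcome_guardrails: list   # metrics that must not regress
    duration_days: int         # how long to run before analysis

onboarding_font_test = PicotSpec(
    population="US mobile app users aged 18-35",
    intervention="Increased font size in onboarding flow",
    comparison="Original font size (control)",
    outcome_primary="Onboarding completion rate",
    outcome_guardrails=["time to complete onboarding", "crash rate"],
    duration_days=14,
)
```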

Step 3: Watch for Biases and Statistical Significance

When analyzing A/B testing results, several critical considerations emerge:

Novelty Effect: Users may engage more with new or changed products initially due to curiosity, but this effect might not be sustainable long-term. For example, if a messaging app introduces animated emojis, users may send more emojis initially, but this could just be a temporary spike. Always account for novelty effects by looking for persistent changes over longer periods.

Primacy Effect: Conversely, users often prefer and stick with the original version of a product. They may resist changes or give negative feedback when a redesign first ships, simply because people prefer familiarity; long-time users are especially prone to this. When redesigning products, negative initial feedback could be a temporary primacy effect, so monitor user sentiment and behavior over a longer period.

Dealing with Interference: Care must be taken to avoid treatment groups interfering with control groups. For example, if testing a new navigation menu, users in the test group could discuss the new menu on forums seen by control group users. Without clean separation, results get skewed as the control is no longer a true baseline.

Statistical Significance: Determine whether results are statistically significant using p-values. A p-value is the probability of obtaining results at least as extreme as the observed results, assuming the null hypothesis is true. Values range from 0 to 1, and p-values below 0.05 (5%) are conventionally treated as statistically significant, i.e., sufficient evidence to reject the null hypothesis in favor of the alternative.
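
For a feel of how that p-value falls out of raw counts, here’s a back-of-the-envelope two-proportion z-test for the onboarding font-size example above, with made-up numbers:

```python
from math import sqrt
from scipy.stats import norm

# Hypothetical counts from the onboarding font-size test (not real data).
control_conv, control_n = 4_100, 10_000   # 41.0% completion
variant_conv, variant_n = 4_280, 10_000   # 42.8% completion

p_c = control_conv / control_n
p_v = variant_conv / variant_n
p_pool = (control_conv + variant_conv) / (control_n + variant_n)

# Standard error under H0 (pooled proportions), then a two-sided p-value.
se = sqrt(p_pool * (1 - p_pool) * (1 / control_n + 1 / variant_n))
z = (p_v - p_c) / se
p_value = 2 * norm.sf(abs(z))

print(f"z = {z:.2f}, p-value = {p_value:.4f}")  # ~0.01 here, so we would reject H0
```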

This framework provides structure while leaving room for creative thinking. The critical part is thinking through the right metrics and showcasing that you understand how experiments are set up.

The Case Study

Interviewer: “You’re a PM at Meta/Facebook on the Ads team. What are the top 3 types of A/B experiments you would run on Facebook ads to increase revenue?”

My Solution Using the Outlined Approach

Step 1: Clarifying Questions and Context

First, let me share my understanding of Facebook’s ads ecosystem:

Facebook generates revenue primarily through an auction system where advertisers bid to show ads to users. The key revenue metric is eCPM (effective Cost Per Mille)—how much revenue Facebook earns per thousand ad impressions. Success depends on balancing three stakeholders: users who want relevant, non-intrusive ads; advertisers who want strong ROI and reach; and Facebook, which wants to maximize revenue sustainably.

To increase revenue, we can optimize eCPM through better auction mechanics, improved ad relevance, or new ad formats that drive higher engagement and conversion.
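
For intuition on the primary metric, eCPM is simply revenue normalized per thousand impressions, so any lever that raises conversion, relevance, or bid density shows up there. A quick sketch with invented numbers:

```python
def ecpm(revenue_usd: float, impressions: int) -> float:
    """Effective cost per mille: revenue earned per 1,000 ad impressions."""
    return revenue_usd / impressions * 1_000

# Purely illustrative numbers.
print(ecpm(12_500.0, 1_000_000))   # 12.5  -> $12.50 eCPM today
print(ecpm(13_400.0, 1_000_000))   # 13.4  -> $13.40 eCPM if a variant lifts revenue ~7%
```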

My assumption: We want experiments that can drive meaningful revenue impact (double-digit percentage improvements in their segments) while preserving advertiser ROI and user experience.

Let me brainstorm potential experiments across different revenue levers:

  1. AR Try-On Ads - Let users virtually try products (makeup, glasses, clothes) using AR
  2. ML-Powered Audience Expansion - Auto-expand targeting using machine learning to find similar high-value users
  3. AI Avatar Brand Ambassadors - Interactive AI representatives that can answer questions and qualify leads
  4. Social Proof Overlays - Show real-time engagement metrics (views, purchases) on ads
  5. Sequential Retargeting Campaigns - Multi-step ad journeys that tell a story across touchpoints

Now, let me prioritize based on:

  • Implementation feasibility (can we build this in reasonable time?)
  • Revenue potential (how much lift can we expect?)
  • Strategic alignment (does it leverage Meta’s unique strengths?)

I’ll focus on the first three. While Social Proof Overlays could drive urgency and Sequential Retargeting could improve conversion through storytelling, the first three experiments better leverage Meta’s investments in AR/VR and AI and have higher long-run revenue potential, even though they are harder to implement.

Step 2: Three High-Impact Experiments

Note on metrics: Since our goal is to increase ads revenue, eCPM remains the primary metric across all experiments—it’s the direct measure of revenue per thousand impressions. What varies is the mechanism (secondary metrics) through which we achieve that eCPM lift and what could go wrong (guardrail metrics) with each approach.

Experiment 1: AR Try-On Ads (Immersive Ad Innovation)

Problem Statement: E-commerce advertisers, especially in beauty and fashion, face a fundamental challenge—users can’t physically try products before buying, leading to low conversion rates and thus lower eCPMs (effective Cost Per Mille) for these ad types.

Hypothesis Formation:

  • Null Hypothesis (H₀): AR try-on features have no impact on ad revenue
  • Alternative Hypothesis (H₁): AR try-on ads will meaningfully increase eCPM through higher conversion rates
  • Reasoning: Sizing concerns and “how will it look on me” uncertainty are my top reasons for abandoning a cart, and I suspect many shoppers are like me.

PICOT Application:

  • Population: Fashion, beauty, and eyewear advertisers currently using catalog ads, starting with top 1000 advertisers by spend
  • Intervention: Implement AR camera effects that let users virtually try products:
    • Makeup: Real-time facial tracking for lipstick, foundation
    • Eyewear: Glasses/sunglasses overlay
    • Accessories: Watches, jewelry placement
  • Comparison: 50/50 A/B split; control sees standard carousel/collection ads
  • Outcome:
    • Primary - eCPM
    • Secondary - conversion rate, time in ad
    • Guardrail - load time, crash rate
  • Time: 6 weeks to account for user adoption curve

Trade-offs: While AR drives engagement, it requires newer devices and more bandwidth. We’d need fallbacks for older phones. Implementation costs are high, but the potential for transforming e-commerce advertising is significant.

Experiment 2: ML-Powered Audience Expansion (Targeting Optimization)

Problem Statement: Many SMB advertisers use overly narrow targeting, limiting their reach and Facebook’s revenue potential. They lack the expertise to find similar high-value audiences.

Hypothesis Formation:

  • Null Hypothesis (H₀): ML audience expansion doesn’t affect revenue per advertiser
  • Alternative Hypothesis (H₁): Intelligent audience expansion will increase revenue per advertiser substantially
  • Reasoning: There’s significant untapped potential in helping SMBs reach similar audiences through machine learning, which should drive meaningful revenue growth.

PICOT Application:

  • Population: Conversion-optimized campaigns with <10K audience size (primarily SMB advertisers)
  • Intervention: ML model expands targeting by finding similar users based on:
    • Behavioral patterns across Facebook’s ecosystem
    • Purchase history and intent signals
    • Cross-advertiser learnings while preserving privacy
  • Comparison: Control maintains original targeting parameters
  • Outcome:
    • Primary - eCPM
    • Secondary - reach expansion, ROAS
    • Guardrail - advertiser opt-out rate
  • Time: 2 weeks minimum to account for ML model learning phase

Key Implementation: Show advertisers which expanded segments perform best, allow opt-out after seeing results, and maintain ROAS thresholds to preserve advertiser trust.

Experiment 3: AI Avatar Brand Ambassadors (Conversational Ads)

Problem Statement: Traditional ads are one-way communication. Users have questions but must leave Facebook to get answers, reducing conversion rates and ad effectiveness, especially for service-based businesses.

Hypothesis Formation:

  • Null Hypothesis (H₀): AI avatars don’t impact ad revenue
  • Alternative Hypothesis (H₁): Conversational AI avatars will drive higher eCPM through better engagement and lead quality
  • Reasoning: Service-based advertisers need qualification conversations. AI avatars providing this at scale should drive significant revenue improvements through better lead quality.

PICOT Application:

  • Population: Service advertisers (insurance, education, travel, real estate) currently using lead generation ads
  • Intervention: AI-powered brand representatives that can:
    • Answer product questions in real-time
    • Provide personalized recommendations based on user inputs
    • Book appointments or schedule demos directly
    • Remember conversation context for follow-up interactions
  • Comparison: Standard lead generation ads with forms
  • Outcome:
    • Primary - eCPM
    • Secondary - lead quality score, conversation completion
    • Guardrail - inappropriate responses, brand safety
  • Time: 8 weeks to allow proper AI model training and user adoption

Implementation Details: Start with rule-based responses and evolve to LLM capabilities, maintain human fallback for complex queries, and enforce strict brand voice guidelines.

Step 3: Watch for Biases and Statistical Significance

Critical Biases to Consider:

For AR Try-On Ads:

  • Novelty Effect: Users might initially engage heavily with AR features out of curiosity. Monitor for 6+ weeks to see if engagement sustains beyond the novelty phase
  • Selection Bias: Early adopters using AR might already be high-intent buyers

For ML Audience Expansion:

  • Survivorship Bias: Advertisers who opt out early might differ from those who stay, skewing results
  • Primacy Effect: SMB advertisers might resist the change initially, preferring their manual targeting

For AI Avatar Brand Ambassadors:

  • Novelty Effect: Initial fascination with AI conversations could inflate early metrics
  • Interference: Users might share their AI chat experiences on social media, influencing control group behavior

Statistical Significance Considerations:

  • Need sufficient sample size (likely 1000+ advertisers per experiment) to detect meaningful eCPM lifts; a back-of-the-envelope sizing sketch follows this list
  • Segment analysis by device type, advertiser size, and user demographics to understand true impact
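
To make the sample-size point tangible, here’s a rough power calculation for a conversion-style metric using the standard two-proportion approximation; the baseline rate and target lift are invented for illustration:

```python
from math import ceil
from scipy.stats import norm

def sample_size_per_group(p_base: float, rel_lift: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate observations needed per group to detect a relative lift
    in a baseline proportion with a two-sided test at the given power."""
    p_var = p_base * (1 + rel_lift)
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    var_sum = p_base * (1 - p_base) + p_var * (1 - p_var)
    return ceil(var_sum * (z_alpha + z_beta) ** 2 / (p_base - p_var) ** 2)

# Hypothetical: 2% baseline conversion, hoping to detect a 5% relative lift.
print(sample_size_per_group(0.02, 0.05))  # on the order of a few hundred thousand per group
```

The smaller the baseline rate and the smaller the lift you care about, the larger the required sample, which is exactly why resolving modest eCPM changes takes Facebook-scale traffic.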

Revenue Impact: While exact projections require historical data, I’d prioritize ML expansion for the quickest validation, followed by AR ads, then AI avatars, based on implementation complexity.

This concludes the main framework application. The interviewer might ask about specific biases, sample size calculations, or how you’d handle conflicting results across segments.

How to Excel in This Case

  • Brainstorm before narrowing: Always generate 4-5 experiment ideas before selecting your test. This keeps you from getting locked into a weak experiment that nobody cares about.

  • Follow PICOT with discipline: Don’t skip steps even if it feels tedious. It has to be done systematically, whether in an interview or in real life.

  • Show industry knowledge: Mentioning Meta’s AR/VR and AI investments isn’t name-dropping—it’s demonstrating product sense. Know what the company is betting on and align your experiments accordingly.

  • Bring real experience: When I mentioned “sizing concerns are one of the prime reasons for cart abandonment for me personally,” that’s authentic. Share your actual observations from running experiments or being a user. Interviewers want to see you’ve lived this, not just studied it.

  • Question everything: Always discuss novelty effect, primacy bias, and other confounding factors. Say things like “The 3% CTR increase could be novelty effect, so I’d monitor for 4+ weeks to see if it sustains.” This shows maturity in interpreting results.

Common Pitfalls to Avoid

  • Jumping to one solution: Going straight to “I’ll test AR ads” without exploring alternatives shows narrow thinking. Always brainstorm multiple options first.

  • Being too theoretical: Don’t recite textbook definitions. This isn’t an academic exercise—they want to know you can run real experiments. Connect to actual products and experiences.

  • Ignoring trade-offs and biases: Not mentioning novelty effect, selection bias, or interference between test groups suggests you haven’t actually run experiments at scale.

Practice This Case

Want to try this A/B testing case yourself with an AI interviewer that challenges your experimental design and provides detailed statistical feedback?

Practice here: PM Interview: A/B Testing - Facebook Ads Revenue

The AI interviewer will push you on your statistical assumptions, challenge your business logic, and test whether you can balance multiple stakeholder needs—just like a real Facebook ads team PM would.

Further Reading

Explore more A/B testing resources I’ve created:

PM Tool of the Week: Mixpanel

As PMs running A/B tests, we need analytics that actually work. This week, I’m sharing Mixpanel—been using it since Fitso, now at Tough Tongue AI.

Here’s why I like it:

  • User-level tracking: See exactly which users did what in your test vs control groups
  • Built-in experiments report: Automatic lift and confidence calculations—no Excel gymnastics
  • Instant segmentation: Slice test results by any property to find hidden patterns

For years, Mixpanel was the only tool doing user-level analytics while Google Analytics was stuck in aggregate-land. Still my go-to for A/B test analysis.

Got an analytics tool you swear by? Hit reply and tell me about it!


What experiments would you run differently? What unique angles would you bring to Facebook’s ads revenue challenge? Hit reply—different perspectives on these analytical challenges always teach me something new!


About PM Interview Prep Weekly

Every Monday, get one complete PM case study with:

  • Detailed solution walkthrough from an ex-Google PM perspective
  • AI interview partner to practice with
  • Insights on what interviewers actually look for
  • Real examples from FAANG interviews

No fluff. No outdated advice. Just practical prep that works.

— Ajitesh
CEO & Co-founder, Tough Tongue AI
Ex-Google PM (Gemini)
LinkedIn | Twitter

