Build an AI Data Analysis Agent in Codex
A step-by-step framework for using Codex as an AI data analyst — from raw dataset to root-cause finding to leadership-ready deck in 30 minutes.
Your VP wants to know why retention dropped last week. It's Friday at noon. You have two hours and a CSV file.
In the past, that meant pivot tables, gut feel, and a hypothesis you weren't confident in. Today, it means opening Codex, directing an AI data analyst with one well-structured prompt, and walking into that meeting with a root-cause analysis, a validated finding, and a deck you built in under 30 minutes.
This guide shows you how to do exactly that — step by step.
Before You Start
1. Pick your tool — use your company's approved version.
Codex is the recommended tool for this workflow, but if you or your company uses a different tool, such as Claude Code, the prompt framework still applies.
Codex runs locally on your computer, meaning it reads files stored directly on your desktop rather than through a browser upload. That matters for two reasons: it's faster for multi-file analysis, and your files stay in a folder on your machine instead of being uploaded one by one. The contents the tool reads are still sent to the model for processing, so the account guidance below applies either way. If you have a ChatGPT subscription, you can download Codex and be set up in under 10 minutes.
Regardless of which tool you use: if your company has an enterprise subscription, use that. Do not upload business data into a personal free-tier account.
2. Have a specific problem, not a vague one.
"What's wrong with my data?" produces nothing. "Why did our MQL-to-SQL conversion rate drop from 34% to 21% between Q1 and Q2?" produces a root-cause analysis. The more specific the question, the sharper the output.
3. Bring more data than you think you need.
If a segment only has 12 records, the AI will find patterns in it — and those patterns will be meaningless. More rows, better analysis.
~10 minutes
Step 1: Clean your data export
- Rename columns to plain English. opp_stg_v2 → Opportunity Stage. cam_src → Campaign Source. The AI reads column headers as instructions. Make them legible.
- Delete columns irrelevant to your question. Analyzing MQL-to-SQL conversion? You don't need the billing address field. A focused dataset produces a focused analysis.
- Standardize your date formats. Mixed formats across rows will cause the AI to misread time-series patterns. Pick one format, apply it to every date column.
- Remove obvious junk rows. Test leads, internal email domains, records missing the key metric you're analyzing. Don't sanitize the whole dataset — just cut the noise that would distort a trend.
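If you would rather script the cleanup than do it by hand in a spreadsheet, the four steps above can be sketched in a few lines of Python. This is a minimal sketch using only the standard library, not a definitive implementation. Apart from opp_stg_v2 and cam_src, every name here (the email, created_date, and billing_address columns, the internal domain, the list of date formats) is an assumption for illustration; swap in whatever your own export contains.

```python
import csv
import io
from datetime import datetime

# Hypothetical settings: replace with the columns in your own export.
RENAMES = {"opp_stg_v2": "Opportunity Stage", "cam_src": "Campaign Source"}
DROP_COLUMNS = {"billing_address"}
DATE_COLUMNS = {"created_date"}
# Formats assumed to appear in the export. Ambiguous values such as
# 03/04/2025 are read month-first here; decide which is right for your data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")
INTERNAL_DOMAINS = {"example.com"}  # assumed internal/test email domain


def normalize_date(value):
    """Return the date as YYYY-MM-DD, or the original text if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return value


def clean_rows(rows):
    """Rename and drop columns, normalize dates, skip internal-domain rows."""
    cleaned = []
    for row in rows:
        domain = row.get("email", "").rpartition("@")[2].lower()
        if domain in INTERNAL_DOMAINS:
            continue
        out = {}
        for key, value in row.items():
            if key in DROP_COLUMNS:
                continue
            if key in DATE_COLUMNS:
                value = normalize_date(value)
            out[RENAMES.get(key, key)] = value
        cleaned.append(out)
    return cleaned


# Illustrative two-row export, made up for this example.
raw = io.StringIO(
    "email,opp_stg_v2,cam_src,created_date,billing_address\n"
    "ana@customer.io,SQL,LinkedIn,04/15/2025,1 Main St\n"
    "qa@example.com,MQL,Test,2025-04-16,n/a\n"
)
cleaned = clean_rows(csv.DictReader(raw))
print(cleaned)
```

Dates that match none of the listed formats are passed through unchanged rather than guessed at, so you can spot and fix them by hand.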
💡 Pro Tip: Check for missing values before you run anything. If your data has gaps in key columns, the AI will work around them silently and give you conclusions based on incomplete information. Ask it upfront: "Are there any missing values or data quality issues in this file I should know about before we start?"
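You can also count the gaps yourself so you have something to compare the AI's answer against. A rough sketch, assuming your rows are loaded as dictionaries (the shape csv.DictReader produces) and that the placeholder strings below are how blanks show up in your export. The column names and sample rows are hypothetical.

```python
def missing_value_report(rows, columns):
    """Count blank or placeholder values per column."""
    # Assumed placeholder spellings; extend this set to match your export.
    placeholders = {"", "n/a", "na", "null", "none", "-"}
    counts = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            value = row.get(col)
            if value is None or value.strip().lower() in placeholders:
                counts[col] += 1
    return counts


# Made-up rows for illustration.
rows = [
    {"Lead Source": "LinkedIn", "Became SQL": "yes"},
    {"Lead Source": "", "Became SQL": "no"},
    {"Lead Source": "Inbound", "Became SQL": "N/A"},
]
report = missing_value_report(rows, ["Lead Source", "Became SQL"])
print(report)
```

If a key column shows more than a handful of gaps, mention the count in your prompt so the AI cannot work around them silently.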
~5 minutes
Step 2: Write a root-cause prompt
The wrong prompt sounds like, "What insights can you find in this data?"
A prompt that’s too open-ended invites generic output. What you get back is a summary of what's in the dataset — not an explanation of why something happened.
A root-cause prompt does three things: gives the AI business context it doesn't have, specifies the outcome you're trying to explain, and asks it to reason about causation — not just flag correlations.
I'm a [role] at a [company type]. I'm trying to understand why [specific metric] changed by [specific amount] between [time period A] and [time period B].
Context:
- [One sentence on how your funnel or product works]
- [One sentence on what you already know or suspect]
- [Any segments or dimensions you want the analysis to focus on]
Using the uploaded CSV, please:
1. Identify the top 2–3 factors most strongly associated with this change
2. For each factor, explain why it might be driving the outcome, not just that it correlates
3. Flag any data quality issues or gaps that limit your confidence
4. Tell me what additional data would strengthen or change the analysis
Do not give me general marketing advice. Stay close to what's actually in the data.
I'm a marketing manager at a B2B SaaS company. I'm trying to understand why our MQL-to-SQL conversion rate dropped from 34% to 21% between Q1 and Q2.
Context: we generate MQLs through inbound content and paid LinkedIn. SQLs are determined by SDR qualification calls. We ran a new campaign in Q2 targeting mid-market healthcare companies. I want to know if the drop is across all lead sources or concentrated in specific channels.
[Then the four numbered asks above.] Do not give me general marketing advice. Stay close to what's in the data.
That last line matters. Without it, AI tools default to generic recommendations — "improve lead nurturing," "increase content frequency" — that could apply to any company. You want findings from your data.
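If you run this kind of analysis often, it can help to keep the template in a small script so the four asks and the closing instruction never get dropped. The sketch below is one possible way to do that. The function name build_prompt and the field names are hypothetical, and the template wording is adapted from the one above.

```python
ROOT_CAUSE_TEMPLATE = """\
I'm a {role} at a {company_type}. I'm trying to understand why {metric} {change} between {period_a} and {period_b}.

Context: {context}

Using the uploaded CSV, please:
1. Identify the top 2-3 factors most strongly associated with this change
2. For each factor, explain why it might be driving the outcome, not just that it correlates
3. Flag any data quality issues or gaps that limit your confidence
4. Tell me what additional data would strengthen or change the analysis

Do not give me general marketing advice. Stay close to what's actually in the data."""


def build_prompt(**fields):
    """Fill the template, refusing to continue if any field is blank."""
    blanks = [name for name, value in fields.items() if not str(value).strip()]
    if blanks:
        raise ValueError("Fill in these fields first: " + ", ".join(blanks))
    return ROOT_CAUSE_TEMPLATE.format(**fields)


prompt = build_prompt(
    role="marketing manager",
    company_type="B2B SaaS company",
    metric="our MQL-to-SQL conversion rate",
    change="dropped from 34% to 21%",
    period_a="Q1",
    period_b="Q2",
    context="We generate MQLs through inbound content and paid LinkedIn.",
)
print(prompt)
```

Rejecting blank fields is deliberate: a prompt with an empty metric or time period is exactly the vague question this step is meant to avoid.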
💡 Pro Tip: If you're using Codex or Claude Code: create a project folder on your desktop, save your CSV there, and point the tool at that folder before you run anything. This is what "working locally" means in practice — the AI reads your files directly rather than through a browser upload. It also means your outputs (cohort analysis files, charts, draft decks) get saved back to that same folder automatically.
~5 minutes
Step 3: Read the first output like a skeptic
Point the tool at your cleaned CSV (or upload it, if you're working in a browser), paste your prompt, and hit send. Here's how to read the output efficiently:
- Read conclusions first, reasoning second. If a finding immediately contradicts something you already know, that's an important signal. Note it; don't ignore it.
- Pay attention to what it didn't find. No difference across lead source channels? That tells you the problem is probably not channel-specific — which narrows your investigation considerably.
- Take confidence qualifiers seriously. "This correlation is weak" or "the sample size for this segment is small" means that finding needs human verification before you act on it.
- Don't treat the first output as the final answer. It's the start of the analysis. The next step is where the real value comes from.
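One quick way to check a "small sample" qualifier yourself is to count the rows behind each segment. The sketch below assumes rows loaded as dictionaries and a hypothetical cutoff of 30 rows. That threshold is an assumption for illustration, not a statistical rule, so choose one that suits your data.

```python
from collections import Counter

MIN_ROWS = 30  # assumed threshold; adjust for your dataset


def small_segments(rows, column, min_rows=MIN_ROWS):
    """Return segments with fewer than min_rows records, with their counts."""
    counts = Counter(row[column] for row in rows)
    return {segment: n for segment, n in counts.items() if n < min_rows}


# Made-up data: 40 LinkedIn rows and 12 Webinar rows.
rows = [{"Lead Source": "LinkedIn"}] * 40 + [{"Lead Source": "Webinar"}] * 12
flagged = small_segments(rows, "Lead Source")
print(flagged)
```

Any finding that rests on a flagged segment is one to verify with a person before you act on it.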
~5 minutes
Step 4: Push back with follow-up prompts
Use these follow-up prompts to pressure-test the findings.
"You said X is a likely driver. Walk me through the specific rows or segments in the data that support that conclusion."
What good looks like: the AI points to a specific cohort, date range, or segment where the pattern is clearest. If it can't show you the data behind the finding, treat the finding as unconfirmed.
"Segment this by [company size / region / lead source / time period] and tell me if the pattern holds across all groups or only in specific ones."
What good looks like: the AI returns a breakdown that either confirms the pattern is consistent across segments (stronger finding) or shows it's concentrated in one group (which is often the more actionable insight — you now know exactly where to focus).
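To sanity-check a segment breakdown the AI gives you, you can recompute it directly. This is a minimal sketch, assuming each row has a segment column and a hypothetical "Became SQL" column containing yes or no. The sample numbers are invented for illustration.

```python
from collections import defaultdict


def conversion_by_segment(rows, segment_col, converted_col="Became SQL"):
    """Return {segment: (conversion rate, row count)}."""
    totals = defaultdict(lambda: [0, 0])  # segment -> [converted, total]
    for row in rows:
        bucket = totals[row[segment_col]]
        bucket[1] += 1
        if row[converted_col].strip().lower() == "yes":
            bucket[0] += 1
    return {seg: (conv / total, total) for seg, (conv, total) in totals.items()}


# Made-up data: LinkedIn converts 2 of 10, Inbound converts 4 of 10.
rows = (
    [{"Lead Source": "LinkedIn", "Became SQL": "yes"}] * 2
    + [{"Lead Source": "LinkedIn", "Became SQL": "no"}] * 8
    + [{"Lead Source": "Inbound", "Became SQL": "yes"}] * 4
    + [{"Lead Source": "Inbound", "Became SQL": "no"}] * 6
)
breakdown = conversion_by_segment(rows, "Lead Source")
for segment, (rate, n) in breakdown.items():
    print(f"{segment}: {rate:.0%} of {n} leads")
```

Returning the row count next to each rate keeps the sample size visible, so a dramatic rate on a tiny segment does not get mistaken for a finding.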
"You flagged a data quality issue with [column]. If that column were excluded entirely, how would your top conclusions change?"
What good looks like: the AI either confirms the conclusions hold without that column (increasing confidence) or revises them significantly (meaning your finding was partially dependent on unreliable data — important to know before you act).
"Rank these findings by how confident you are in each one. For the top finding, tell me what's driving your confidence. For the lowest, tell me what's limiting it."
What good looks like: a clear tiered list that separates what's well-supported from what's directional. Use the top finding to drive your recommendation. Use the lowest-confidence finding to define your next data pull.
"Break this data down by [week / month / quarter] and tell me if the trend is accelerating, decelerating, or inconsistent. Flag any periods that look like outliers."
What good looks like: a clear directional read on whether the problem is getting worse, stabilizing, or was a one-time event. Knowing the trajectory changes the urgency of the response.
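The three labels in that prompt can be made concrete. The sketch below is one simple way to define them, assuming you already have the metric for each period in order. The rule used here (all changes in one direction, with each change larger or smaller than the last) is an illustrative definition, not a standard one.

```python
def classify_trend(rates):
    """Label an ordered series of period values.

    accelerating: every change has the same sign and each is larger than the last
    decelerating: every change has the same sign and each is smaller than the last
    inconsistent: anything else
    """
    # Round to damp floating-point noise when rates are decimals.
    deltas = [round(b - a, 4) for a, b in zip(rates, rates[1:])]
    if len(deltas) < 2:
        return "not enough periods"
    same_direction = all(d < 0 for d in deltas) or all(d > 0 for d in deltas)
    if not same_direction:
        return "inconsistent"
    sizes = [abs(d) for d in deltas]
    pairs = list(zip(sizes, sizes[1:]))
    if all(later > earlier for earlier, later in pairs):
        return "accelerating"
    if all(later < earlier for earlier, later in pairs):
        return "decelerating"
    return "inconsistent"


# Made-up monthly conversion rates, in percent.
print(classify_trend([34, 32, 28, 21]))  # drops of 2, 4, 7
print(classify_trend([34, 27, 23, 21]))  # drops of 7, 4, 2
print(classify_trend([34, 30, 33, 21]))  # down, up, down
```

With only a few periods, any label is directional at best, so treat it as a prompt for the next question rather than a conclusion.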
"I suspect [X] is driving [Y]. Does the data support or contradict that? Walk me through what you find."
What good looks like: the AI either validates your hypothesis with supporting evidence or pushes back with data that contradicts it. Either outcome is useful — confirmation saves time, contradiction saves you from making a bad decision.
~5 minutes
Step 5: Validate before you act
Does this finding match what people in your organization actually observed? If the AI says LinkedIn leads had a 70% higher qualification rate in Q1 but your SDRs have been complaining about LinkedIn lead quality since February, something is off. Talk to a human close to the data before you act.
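Part of validating is checking that a headline number means what you think it means. "70% higher" usually describes a relative difference, not percentage points, and the two are easy to confuse. The sketch below recomputes a relative difference from two rates and compares it with the claimed figure. The function names, the tolerance, and the sample rates are assumptions for illustration.

```python
def relative_lift(rate_a, rate_b):
    """How much higher rate_a is than rate_b, as a fraction of rate_b."""
    if rate_b == 0:
        raise ValueError("baseline rate is zero")
    return rate_a / rate_b - 1


def claim_holds(claimed_lift, rate_a, rate_b, tolerance=0.05):
    """True if the recomputed lift is within tolerance of the claimed lift."""
    return abs(relative_lift(rate_a, rate_b) - claimed_lift) <= tolerance


# Hypothetical rates you recomputed from the CSV: 34% for one source, 20% for the rest.
print(claim_holds(0.70, 0.34, 0.20))  # recomputed lift is about 0.70
print(claim_holds(0.70, 0.25, 0.20))  # recomputed lift is about 0.25

# The same arithmetic applied to the example drop from 34% to 21%:
# 13 percentage points, or roughly a 38% relative decline.
print(round(relative_lift(0.21, 0.34), 3))
```

If the recomputed number and the claimed number disagree, that is a reason to ask the AI which rows it used before you take the finding to anyone else.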
~2 minutes
Step 6: Turn your findings into a deck
Once you have a validated finding, you don't need to manually build a presentation. Ask Codex or Claude Code to do it directly from the analysis output.
"Based on this analysis, create a PowerPoint presentation structured for a leadership audience. Include: the business question we were investigating, the key finding, the data that supports it, confidence level, and recommended next action."
What good looks like: a structured deck with one finding per slide, data visualizations pulled from the analysis, and a clear recommendation slide at the end. You'll want to edit and add branding — but the structure and content are done.
After 30 minutes, you'll have one of three things
- A confirmed finding — enough evidence to brief your team or change a decision. Write it up in one paragraph: what changed, what's driving it, what you're doing about it.
- A strong hypothesis — plausible and supported by the data, but not fully validated. Identify the one piece of evidence that would confirm or kill it. Go get it.
- A data gap — you've learned your current data isn't sufficient to answer the question. That's still a valuable outcome. Now you know exactly what to instrument or ask RevOps to pull.