Your boss drops a spreadsheet in your lap. Forty columns, three years of data, inconsistent date formats, a column called "FINAL_v3_USE_THIS," and a tab named "DO NOT DELETE." You have a meeting in two hours. AI data analysis prompts are your fastest way out of this situation, and most people are using them wrong.
Wrong means: pasting raw data into ChatGPT and hoping it figures out what you actually need. It won't. Not because AI is dumb, but because AI is fast, confident, and has no idea what your boss actually cares about. That gap between confident-sounding output and correct output is where decisions go to die.
This article gives you 10 copy-paste prompts you can use today, a reusable formula for every analysis task, and the guardrails to stop you from trusting a hallucinated benchmark in front of your CFO.
Before you touch a prompt: the one rule that matters
Using AI for analysis is like hiring a smart intern who reads everything instantly, remembers nothing about your business, and will confidently answer a question you didn't ask.
Your job is to give that intern three things before they touch the data:
- The dataset structure (columns, row count, what each row represents)
- The business question in plain English
- The output format you actually need
Skip any of those and you'll get something that looks like an answer but isn't. This isn't a flaw to work around. It's the entire model. Garbage in, garbage out. The better your setup, the better your output.
The reusable formula: Act as [role]. I have a dataset with [describe columns, row count, grain]. My business question is: [specific question]. Please [specific task]. Format the output as [table / bullet list / SQL / Python / plain English]. Flag any assumptions you make.
That last sentence, "flag any assumptions," is the one most people leave out. It's the most important one.
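If you reuse the formula a lot, it can help to turn it into a small template so no part gets skipped. Here is a minimal sketch in Python. The function name, the parameter names, and the example dataset are all made up for illustration; only the wording of the template comes from the formula above.

```python
def build_analysis_prompt(role, dataset, question, task, output_format):
    """Fill in the reusable formula. Every argument is required on purpose,
    so you cannot forget the dataset description or the output format."""
    return (
        f"Act as {role}. "
        f"I have a dataset with {dataset}. "
        f"My business question is: {question}. "
        f"Please {task}. "
        f"Format the output as {output_format}. "
        "Flag any assumptions you make."
    )


# Hypothetical example values, not real data.
prompt = build_analysis_prompt(
    role="a data analyst",
    dataset="columns order_id, order_date, region, revenue; "
            "about 12,000 rows; one row per order",
    question="which region grew fastest year over year",
    task="outline the calculation steps",
    output_format="a bullet list",
)
print(prompt)
```

The closing "Flag any assumptions you make." is hard-coded rather than passed in, so it can't be left out.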
A word on data safety before anything else. Do not paste confidential customer records, employee salaries, health data, unreleased financial results, or regulated information into any AI tool unless your company has an explicit policy that permits it. Anonymize first. Use fake IDs. Paste structure and sample rows, not the whole damn file. This isn't paranoia. It's basic professional hygiene, and it's on you, not the AI.
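One way to "use fake IDs and paste a sample" is sketched below. It is an illustration, not a compliance tool. The column names, the salt, and the helper names are hypothetical. Salted hashing is pseudonymization rather than true anonymization, so your company policy still decides whether the result can be shared.

```python
import hashlib


def pseudonymize(value, salt):
    """Swap a real identifier for a stable fake one (same input, same token)."""
    digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
    return "ID_" + digest[:8]


def safe_sample(rows, id_column, drop_columns, salt, n=5):
    """Keep only the first n rows, drop sensitive columns, replace the ID."""
    sample = []
    for row in rows[:n]:
        cleaned = {k: v for k, v in row.items() if k not in drop_columns}
        cleaned[id_column] = pseudonymize(row[id_column], salt)
        sample.append(cleaned)
    return sample


# Hypothetical rows standing in for a real export.
rows = [
    {"customer_id": 1001, "email": "a@example.com", "region": "West", "revenue": 120.0},
    {"customer_id": 1002, "email": "b@example.com", "region": "East", "revenue": 80.0},
    {"customer_id": 1001, "email": "a@example.com", "region": "West", "revenue": 45.0},
]

sample = safe_sample(rows, "customer_id", {"email"}, salt="keep-this-secret", n=3)
for row in sample:
    print(row)
```

Because the token is stable, repeat customers still look like repeat customers in the sample, which keeps the structure of the data intact without exposing the real ID.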
10 AI data analysis prompts you can use right now
Prompt 1: Understand what you're even looking at
You've got the file. You don't know what it contains. Start here.
Act as a data analyst. I have a dataset with the following columns: [paste column names and a few sample rows]. Each row represents [one customer / one transaction / one month of sales, etc.]. Please: (1) describe what this dataset likely tracks, (2) identify any columns that look ambiguous or misnamed, (3) flag any obvious data quality issues from the sample, (4) suggest three questions this data could answer. Flag any assumptions you make.
This prompt stops you from jumping straight to analysis before you understand the shape of what you're working with. It takes two minutes and saves you from building a chart on top of a column that means something completely different than you thought.
Prompt 2: Clean messy columns
Inconsistent capitalization. "N/A" and "null" and blank cells all meaning the same thing. Dates in three formats. Every analyst's Tuesday.
Act as a data cleaning assistant. This column contains [paste column name and 20 sample values]. Values should represent [what this column is supposed to contain]. Please: (1) identify the distinct messy patterns, (2) suggest a cleaning rule for each pattern, (3) write an Excel formula or Python/pandas one-liner to apply each fix. List any values you're unsure about rather than guessing.
The "list what you're unsure about" instruction matters. Without it, AI will make a call on every edge case and not tell you it did. You want it to surface the ambiguous ones so a human (you) makes that judgment call.
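Here is a rough sketch of what "surface the ambiguous ones" can look like in code. It uses plain Python rather than pandas so it runs anywhere. The list of null tokens and the three date formats are assumptions about a typical messy column, not rules from any particular dataset.

```python
from datetime import datetime

# Assumed spellings of "no value" and assumed date formats; adjust to your data.
NULL_TOKENS = {"", "n/a", "na", "null", "none", "-"}
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d/%m/%Y")


def clean_date(raw):
    """Return (iso_date_or_None, status). Status is 'ok', 'missing', or 'review'."""
    text = (raw or "").strip()
    if text.lower() in NULL_TOKENS:
        return None, "missing"
    candidates = set()
    for fmt in DATE_FORMATS:
        try:
            candidates.add(datetime.strptime(text, fmt).date().isoformat())
        except ValueError:
            continue
    if len(candidates) == 1:
        return candidates.pop(), "ok"
    # Zero matches or conflicting matches: leave it for a human.
    return None, "review"


for value in ["2024-03-05", " N/A ", "03/04/2024", "13/04/2024", "tomorrow"]:
    print(repr(value), "->", clean_date(value))
```

"03/04/2024" could be March 4 or April 3, so the function refuses to pick and marks it for review. That is the same behavior the prompt asks the AI for.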
Prompt 3: Build spreadsheet formulas
This is where AI genuinely earns its keep. Formula writing is tedious, error-prone, and the kind of task that eats 45 minutes when it shouldn't. Our AI spreadsheet prompts guide goes deeper on this, but here's the core data analysis version:
I have an Excel / Google Sheets file. Column A is [description], Column B is [description], Column C is [description]. I want to calculate [exactly what you want to calculate] and put the result in Column D. The formula should [handle blanks / ignore zero values / look up from another sheet called X]. Write the formula and explain each part. Flag if there's more than one way to do it.
Paste the formula into a test row first. Check it against three rows you can verify manually. Always verify row counts and filters before you trust a formula on a full dataset.
Prompt 4: Write SQL for a specific question
Act as a SQL analyst. I have a table called [table_name] with these columns: [list columns with data types]. I want to find [specific business question, e.g., "the top 10 customers by total revenue in Q1 2024, excluding refunded orders"]. Write a SQL query that answers this. Use standard SQL. Add comments explaining each clause. Flag any joins or filters I should double-check before running this in production.
Always run AI-generated SQL on a sample or dev environment first. Check the row count. Check that your joins aren't multiplying rows. Check that your date filters are doing what you think they're doing. AI will confidently write syntactically correct SQL that answers the wrong question.
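To see what "joins multiplying rows" looks like, here is a small self-contained sketch using Python's built-in sqlite3 module and an in-memory database. The tables, columns, and values are invented for the demonstration; the point is the before-and-after comparison, which you can adapt to your own dev environment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, revenue REAL);
    CREATE TABLE customers (customer_id INTEGER, segment TEXT);
    INSERT INTO orders VALUES (1, 10, 100.0), (2, 10, 50.0), (3, 20, 75.0);
    -- Customer 10 is listed twice: a duplicate that will fan out the join.
    INSERT INTO customers VALUES (10, 'retail'), (10, 'retail'), (20, 'wholesale');
""")

JOIN = "FROM orders o JOIN customers c ON o.customer_id = c.customer_id"

rows_before = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
rows_after = conn.execute(f"SELECT COUNT(*) {JOIN}").fetchone()[0]
revenue_before = conn.execute("SELECT SUM(revenue) FROM orders").fetchone()[0]
revenue_after = conn.execute(f"SELECT SUM(o.revenue) {JOIN}").fetchone()[0]

print("rows:", rows_before, "->", rows_after)
print("revenue:", revenue_before, "->", revenue_after)
if rows_after != rows_before:
    print("Row count changed after the join. Check for duplicate keys.")
conn.close()
```

The query is syntactically fine and runs without error, yet total revenue is inflated because one customer appears twice in the lookup table. Comparing counts and totals before and after the join is what catches it.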
Prompt 5: Write Python / pandas steps
Act as a data analyst using Python and pandas. I have a DataFrame called df with these columns: [list columns and dtypes if you know them]. I want to: [numbered list of transformation steps]. Write the pandas code for each step. Use comments. After the code, list any assumptions you made about the data types or nulls, and suggest one sanity check I should run after each transformation.
The sanity check request is the move here. Something like "print the row count before and after the merge" or "check for duplicates on the key column" catches the silent errors that wreck downstream analysis.
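The two sanity checks mentioned above can be sketched in a few lines. This version uses plain Python lists of dictionaries so it runs without any installs; in pandas you would typically reach for `len(df)` and `df.duplicated(subset=[...])` for the same purpose. The helper names and sample rows are hypothetical.

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Return the key values that appear more than once."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


def run_step(name, rows, step):
    """Apply one transformation and report the row count before and after."""
    result = step(rows)
    print(f"{name}: {len(rows)} rows -> {len(result)} rows")
    return result


# Hypothetical sample data: order 102 was exported twice.
orders = [
    {"order_id": 101, "revenue": 40.0},
    {"order_id": 102, "revenue": 15.0},
    {"order_id": 102, "revenue": 15.0},
    {"order_id": 103, "revenue": None},
]

dupes = duplicate_keys(orders, "order_id")
print("duplicate order_ids:", dupes)

with_revenue = run_step(
    "drop rows with missing revenue",
    orders,
    lambda rows: [r for r in rows if r["revenue"] is not None],
)
```

Neither check fixes anything. They just make a silent change visible, so you can decide whether losing one row or keeping a duplicate is what you intended.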
Prompt 6: Check summary statistics
You've run your analysis. Now you want to know if the numbers smell right before you put them in a slide.
I ran a summary of [dataset description]. Here are the results: [paste summary stats: mean, median, min, max, row count, null count per column]. Does anything look suspicious? What would a reasonable range be for each metric given this is [context: e.g., e-commerce orders for a mid-size retailer]? List anything worth investigating before I present these numbers.
Notice what this prompt is not doing: it's not asking AI to invent benchmarks. It's asking for a smell test on your specific numbers. If the AI starts citing industry benchmarks you can't verify, don't use them. Ask it to flag the concern without making up a comparison point.
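If you don't already have the summary stats the prompt asks for, a short script can produce them. This sketch uses Python's standard statistics module; the function name and the sample order values are illustrative only.

```python
import statistics


def summarize(values):
    """Summary stats for one numeric column; None counts as a null."""
    present = [v for v in values if v is not None]
    return {
        "row_count": len(values),
        "null_count": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "min": min(present),
        "max": max(present),
    }


# Hypothetical order values with one missing entry and one suspicious one.
order_values = [20.0, 35.0, None, 25.0, 40.0, 4000.0]
summary = summarize(order_values)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up sample the mean (824.0) sits far above the median (35.0). That gap is exactly the kind of thing worth pasting into the prompt and asking about before the number lands on a slide.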
Prompt 7: Find suspicious outliers
Act as a data quality reviewer. Here are rows from my dataset that appear to be outliers: [paste 10-20 suspect rows]. The column in question is [column name] and it should represent [what it represents]. For each row: (1) explain why it might be an outlier, (2) suggest whether it looks like a data error, a real edge case, or something that needs manual review. Do not delete or replace values. Just flag and explain.
The human decides what to do with outliers. The AI helps you see them faster and think through the options. That's the right division of labor.
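To pick the suspect rows to paste in, one common approach is the interquartile-range rule. The sketch below is one option among many, not the only way to define an outlier. The 1.5 multiplier is a convention, and the sample values are made up. Note that it only flags rows and never changes the data.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return (index, value) pairs outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]


# Hypothetical order values; one of them looks like a typo or a bulk order.
values = [30, 25, 4000, 33, 22, 38, 27, 35, 31, 40]
flagged = flag_outliers(values)
print("flagged for review:", flagged)
print("rows still in dataset:", len(values))
```

The function returns positions and values so you can pull the full rows for review. Deciding whether 4000 is an error or your best customer stays with you.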
Prompt 8: Explain a chart in plain English
You made a chart. Your boss won't read it unless there's a sentence underneath it that explains exactly what to see. This is where AI is genuinely useful and where a lot of people are already saving time.
Dee covers this kind of task-level thinking in the book "Don't Replace Me," where the point is simple: use AI for the parts of work you hate or the parts that slow you down. Writing chart captions at 6pm is exactly that.
Here is a description of a chart I made: [describe chart type, axes, what data it shows, the key trend or pattern you see]. Write three versions of a plain-English caption that explains what the chart shows and why it matters. Audience: [your specific audience, e.g., non-technical marketing team]. Keep each caption under 40 words. Do not editorialize or attribute causation unless the data clearly supports it.
Prompt 9: Turn findings into a decision memo
This is where the work becomes useful. Raw findings aren't decisions. Someone has to translate them. AI can draft that translation fast, and you clean it up.
Act as a business analyst writing for senior leadership. Here are my key findings from a data analysis: [bullet point your 3-5 main findings, including the numbers]. The original business question was: [restate the question]. Please write a one-page decision memo with: (1) context, (2) key findings, (3) what the data suggests, (4) one recommended next step with a rationale. Flag anywhere the data doesn't fully support the recommendation so I can decide how to handle it.
That last instruction keeps AI from overstating your conclusions. The memo draft is a starting point, not a finished product. Read every sentence and ask: does my data actually support this? If not, cut it or qualify it. For research and synthesis tasks like this, the approach in our desk research prompts guide applies directly.
Prompt 10: Check whether your analysis supports the conclusion
This one's uncomfortable, which is why it's the most useful.
I'm about to present the following conclusion: [state your conclusion]. Here is the analysis I used to reach it: [describe or paste your key numbers, method, and assumptions]. Please play devil's advocate. What are the three most credible ways someone could challenge this conclusion? What data or information would I need to make the conclusion airtight? What assumptions am I making that might not hold?
You're not outsourcing your judgment here. You're stress-testing it before someone else does it for you in a meeting. Our prompts guide for work uses a similar adversarial pattern in other contexts.
What AI data analysis prompts can't do for you
AI can write the formula. It can't tell you whether the metric you chose is the right one to care about. That distinction matters more than any prompt.
"What AI can and can't do" gets into this properly, but the short version for analysis work is: AI is excellent at transformation, formatting, formula writing, and explaining patterns in data you provide. It's bad at knowing whether your business question is the right one, whether a correlation is causal, whether a benchmark it cites is real, or whether the tradeoff your analysis implies is actually acceptable to your organization.
Those are judgment calls. They belong to you. That's not a limitation to resent. It's the reason you still have a job.
The quality checklist before you trust anything AI generates:
- Row counts match before and after joins or filters
- Date ranges are correct and inclusive/exclusive as intended
- Denominators are what you think they are
- Units are consistent (are you mixing monthly and annual figures?)
- Null handling is explicit, not assumed
- Any benchmarks or external numbers come from sources you can verify
- Formulas have been tested on rows you can manually confirm
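The date-range item trips people up more often than it should, so here is a small sketch of one way to make the boundaries explicit. It uses a half-open range (start included, end excluded), which is a common convention but an assumption here. The dates and the function name are illustrative.

```python
from datetime import date


def in_range(d, start, end_exclusive):
    """True if start <= d < end_exclusive (start included, end excluded)."""
    return start <= d < end_exclusive


# Q1 2024 expressed as "from Jan 1 up to, but not including, Apr 1".
q1_start = date(2024, 1, 1)
q2_start = date(2024, 4, 1)

# Hypothetical order dates sitting right on the boundaries.
order_dates = [
    date(2023, 12, 31),
    date(2024, 1, 1),
    date(2024, 3, 31),
    date(2024, 4, 1),
]

q1_orders = [d for d in order_dates if in_range(d, q1_start, q2_start)]
print([d.isoformat() for d in q1_orders])
```

Testing with dates that sit exactly on the edges is the quickest way to confirm whether a filter is inclusive or exclusive, whether it lives in Python, SQL, or a spreadsheet formula.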
One more time on data privacy: no customer PII, no employee records, no health data, no unreleased financials, no regulated data. Anonymize. Paste structure and sample. Keep your job.
Frequently asked questions
Can I use ChatGPT or Claude to analyze a spreadsheet?
Yes, and it works well if you give it the column names, a description of what each row represents, and a specific question. Don't paste the whole file. Paste the structure and a sample of rows. Both ChatGPT and Claude can write formulas, spot data quality issues, and help you build a summary. Verify everything against your source data before you use it in a report.
What's the best way to write an AI prompt for data analysis?
Start with a role, describe your dataset (columns, row count, what each row represents), state your business question in plain English, specify the output format, and ask it to flag any assumptions it makes. That five-part structure covers most of what AI needs to give you something useful rather than something that looks useful.
Is it safe to paste data into ChatGPT for analysis?
Only if your company policy explicitly permits it and the data isn't confidential, regulated, or personally identifiable. For most work data, anonymize first: replace names with IDs, remove email addresses, and use fake values for anything sensitive. When in doubt, paste only the column structure and a few fake sample rows, then describe the real data in plain text.
Can AI write SQL queries for me?
Yes, and it's reasonably good at it. Give it the table name, column names with data types, and the exact question you're trying to answer. Always test AI-generated SQL on a sample or non-production environment first. Check row counts, verify joins aren't multiplying records, and confirm date filters are doing what you think.
Will AI make up statistics or benchmarks in my analysis?
It can, and it will do so confidently. Never ask AI to supply benchmarks, industry averages, or external comparisons unless you can verify them independently. Ask it to flag where it's making assumptions rather than stating facts. Treat any number that didn't come from your own dataset as unverified until you find a primary source.
What tasks is AI actually useful for in data analysis?
Writing formulas, generating SQL, cleaning messy columns, explaining what a chart shows, drafting summary memos, and checking for data quality issues are all genuinely useful. Deciding which metric matters, interpreting whether a pattern is meaningful for your business, and making the final recommendation are still on you. AI speeds up the mechanical parts. The judgment parts are your job.