6.7 | medium | CONDITIONAL GO

Excel-to-Warehouse Migration Copilot

Tool that analyzes a company's existing Excel reports and automatically generates the equivalent dbt models and dashboard specs for a proper data warehouse.

DevTools
Growing companies (100-1000 employees) transitioning from ad-hoc Excel report...
The Gap

Companies stuck in the 'Excel zone' have years of tribal knowledge embedded in spreadsheets. Migrating this to a real DW means manually reverse-engineering every spreadsheet's logic, which is tedious and error-prone.

Solution

Upload or connect your Excel/Google Sheets reports. The tool parses formulas, pivot tables, and data flows to generate a catalog of existing metrics and KPIs, suggested dimensional models, draft dbt SQL models, and BI dashboard wireframes that replicate current reports. It bridges the gap between the 'Excel zone' and the modern data stack.
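
To make the formula-to-dbt step concrete, here is a minimal sketch of how one simple pattern might be rendered as a draft model. It only handles a SUMIFS with literal equality criteria, and it assumes the mapping from Excel columns to warehouse column names has already been resolved. The function name `sumifs_to_dbt_sql`, the model name `stg_sales`, and the column names are all hypothetical, chosen for illustration.

```python
def sumifs_to_dbt_sql(original_formula, source_model, sum_column, criteria):
    """Render a draft dbt model for a SUMIFS with literal equality criteria.

    criteria is a list of (warehouse_column, literal_value) pairs.
    All names passed in are assumptions supplied by the caller.
    """
    conditions = [
        # Double any single quote so the literal stays valid SQL.
        "{} = '{}'".format(col, str(val).replace("'", "''"))
        for col, val in criteria
    ]
    lines = [
        # Keep the original Excel logic as an inline comment for reviewers.
        f"-- Original Excel: {original_formula}",
        "select",
        f"    sum({sum_column}) as total_{sum_column}",
        "from {{ ref('" + source_model + "') }}",
    ]
    if conditions:
        lines.append("where " + "\n  and ".join(conditions))
    return "\n".join(lines)


draft = sumifs_to_dbt_sql(
    original_formula='=SUMIFS(Sales!D:D,Sales!B:B,"EMEA",Sales!C:C,"2024")',
    source_model="stg_sales",          # hypothetical staging model
    sum_column="amount",               # hypothetical column for Sales!D:D
    criteria=[("region", "EMEA"), ("fiscal_year", "2024")],
)
print(draft)
```

A real translator would also need to handle comparison operators, wildcards, and criteria that reference other cells, none of which this sketch attempts.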

Revenue Model

Per-project pricing: $500-2000 per migration project based on number of spreadsheets analyzed. Optional ongoing subscription for drift detection between Excel and warehouse reports.

Feasibility Scores
Pain Intensity: 8/10

This is a real, visceral pain point. Data teams routinely spend 2-6 months manually reverse-engineering spreadsheets during warehouse migrations. The Reddit thread and countless similar posts confirm this. The pain is acute during a specific transition moment — it's not chronic, but when it hits, it's the #1 priority.

Market Size: 6/10

TAM is meaningful but bounded. Target is companies 100-1000 employees actively transitioning to a data warehouse — maybe 50K-100K companies globally at any given time. At $1K avg project price, that's $50M-100M TAM. Not venture-scale, but excellent for a bootstrapped or small-team product. The 'per-project' nature limits recurring revenue unless drift detection upsells work.
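
The TAM figure is simple multiplication, and it is worth writing out because every input is an estimate. A quick sketch using only the numbers quoted above:

```python
# Back-of-envelope TAM from the figures quoted in the text.
companies_low, companies_high = 50_000, 100_000  # companies mid-transition (estimate)
avg_project_price = 1_000                        # USD per migration project

tam_low = companies_low * avg_project_price
tam_high = companies_high * avg_project_price

print(f"TAM: ${tam_low / 1e6:.0f}M to ${tam_high / 1e6:.0f}M")
# prints "TAM: $50M to $100M"
```

Because the result scales linearly with both inputs, halving either the company count or the average price halves the TAM.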

Willingness to Pay: 7/10

Companies currently pay $30K-150K+ for consulting to do this manually. A tool at $500-2000 per project is at least a 15x cost reduction on those figures (up to 300x at the extremes), so the value prop is obvious. Data teams have budget. The risk: buyers may see this as a one-time purchase, not ongoing spend. Also, some may try the DIY-with-ChatGPT route for free.
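
The cost-reduction multiple implied by these price points can be checked directly. The sketch below uses only the consulting and tool prices quoted above. The "+" on the consulting figure is open-ended, so the upper result is itself a floor.

```python
consulting_range = (30_000, 150_000)  # USD, typical consulting engagement
tool_range = (500, 2_000)             # USD, per-project tool price

def reduction_range(consulting, tool):
    """Return the smallest and largest cost-reduction multiples."""
    smallest = consulting[0] / tool[1]  # cheapest consulting vs. priciest tool tier
    largest = consulting[1] / tool[0]   # priciest consulting vs. cheapest tool tier
    return smallest, largest

low, high = reduction_range(consulting_range, tool_range)
print(f"Cost reduction: {low:.0f}x to {high:.0f}x")
# prints "Cost reduction: 15x to 300x"
```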

Technical Feasibility: 5/10

This is the hardest part. Parsing Excel formulas (including VLOOKUP chains, nested IFs, pivot tables, VBA macros, cross-workbook references, named ranges) is genuinely complex. Real-world spreadsheets are messy — merged cells, implicit type conversions, hardcoded values mixed with formulas. An MVP that handles 60-70% of common patterns is achievable in 6-8 weeks by a strong solo dev with LLM assistance, but the long tail of edge cases is brutal. Generating correct dbt models with proper joins and grain is non-trivial.
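
To give a sense of where the difficulty starts, here is a rough sketch of the very first step: pulling A1-style cell and range references out of a formula string. It is a regex approximation under stated assumptions, not a real Excel parser. It ignores named ranges, cross-workbook references, structured table references, and whole-column ranges such as D:D, and it would wrongly pick up reference-like text inside string literals. The name `extract_refs` is hypothetical.

```python
import re

REF_RE = re.compile(
    r"(?<![A-Za-z0-9_])"                                # not inside a longer name
    r"(?:(?P<sheet>'[^']+'|[A-Za-z_][A-Za-z0-9_]*)!)?"  # optional Sheet! prefix
    r"(?P<start>\$?[A-Z]{1,3}\$?[0-9]+)"                # A1 or $A$1
    r"(?::(?P<end>\$?[A-Z]{1,3}\$?[0-9]+))?"            # optional :B50 range end
    r"(?![A-Za-z0-9_(])"                                # skip functions like LOG10(
)

def extract_refs(formula, default_sheet):
    """Return (sheet, first_cell, last_cell) for each reference found."""
    refs = []
    for m in REF_RE.finditer(formula):
        sheet = (m.group("sheet") or default_sheet).strip("'")
        start = m.group("start").replace("$", "")
        end = (m.group("end") or m.group("start")).replace("$", "")
        refs.append((sheet, start, end))
    return refs


formula = "=IF(VLOOKUP(A2,Rates!$A$1:$B$50,2,FALSE)>0,B2*'FX Table'!C3,0)"
for ref in extract_refs(formula, default_sheet="Report"):
    print(ref)
```

Even this small case needs lookarounds to avoid treating function names such as LOG10 as cells, which hints at how quickly the long tail grows.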

Competition Gap: 9/10

This is the strongest signal. No product exists that does end-to-end Excel logic → dbt model generation. Current options are: expensive consultants, one-off LLM prompts, or DIY. The gap is wide open. The risk is that dbt Labs or a well-funded data tooling company builds this as a feature, but they haven't yet.

Recurring Potential: 4/10

The core use case is a one-time migration — companies don't migrate from Excel every month. The 'drift detection' subscription idea is creative but weak: once you've migrated, you want to kill the spreadsheets, not monitor them. Recurring revenue would need to come from adjacent features (ongoing documentation, model testing, new spreadsheet onboarding for acquisitions) or a platform pivot. This is the biggest business model weakness.

Strengths
  • Massive competition gap: no direct product competitor exists for automated Excel-to-dbt migration
  • Clear cost reduction of 15x or more vs. consulting alternatives (at the quoted $30K-150K+ consulting and $500-2000 tool prices), making the ROI argument trivial
  • Timing is excellent: LLMs make formula parsing newly feasible, and modern data stack adoption is accelerating
  • Specific, well-defined buyer persona (data/analytics lead at growing company) with identifiable purchase trigger (exec mandate for real reporting)
Risks
  • One-time purchase dynamics make revenue lumpy and recurring revenue hard to achieve: this is a project tool, not a platform
  • Excel parsing edge cases are a deep technical rabbit hole: real-world spreadsheets are far messier than demos suggest (VBA macros, circular refs, hardcoded values)
  • LLM-assisted DIY (paste formulas into ChatGPT) is a free competitor that's 'good enough' for some teams
  • dbt Labs, Snowflake, or a well-funded startup could build this as a feature if the category gets validated
  • Accuracy expectations will be sky-high: if the generated dbt models produce different numbers than Excel, trust evaporates instantly
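
One way to manage the accuracy risk in the last bullet is to reconcile every migrated metric against the value the spreadsheet currently shows. The sketch below assumes both sides have already been exported as simple name-to-number dictionaries. The function name `reconcile`, the metric names, and the half-cent tolerance are all hypothetical choices.

```python
import math

def reconcile(excel_values, warehouse_values, rel_tol=1e-9, abs_tol=0.005):
    """Return {metric: (excel, warehouse)} for values that disagree or are missing."""
    mismatches = {}
    for metric in sorted(excel_values.keys() | warehouse_values.keys()):
        e = excel_values.get(metric)
        w = warehouse_values.get(metric)
        if e is None or w is None:
            mismatches[metric] = (e, w)   # metric exists on one side only
        elif not math.isclose(e, w, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches[metric] = (e, w)   # numbers differ beyond tolerance
    return mismatches


excel = {"revenue": 1204.50, "orders": 312, "margin": 0.31}
warehouse = {"revenue": 1204.504, "orders": 310}
print(reconcile(excel, warehouse))
```

Here `revenue` passes because the difference is within the rounding tolerance, while `orders` and the missing `margin` would be flagged for review.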
Competition
Equals.app

Spreadsheet interface connected directly to data warehouses

Pricing: ~$49/user/month, team plans available
Gap: It's a bridge, not a migration tool. Doesn't reverse-engineer existing Excel logic, generate dbt models, or help you actually move off spreadsheets — it just makes spreadsheets smarter.
Flatfile

Data onboarding platform that imports CSV/Excel data into applications and databases with validation, mapping, and transformation rules.

Pricing: Free tier; paid plans from ~$500/month
Gap: Only moves the DATA, not the LOGIC. Ignores formulas, pivot tables, and business rules entirely. It's a data ingestion tool, not a logic migration tool.
Fivetran / Airbyte (Google Sheets & Excel connectors)

ELT platforms with connectors that pull raw data from Google Sheets or Excel files into a data warehouse on a schedule.

Pricing: Fivetran: usage-based from ~$1/credit; Airbyte: open-source or cloud at ~$1+/credit
Gap: Moves cell values only — completely ignores formulas, business logic, KPI definitions, and report structure. You still need to manually rebuild everything in dbt.
Data Consultancies (Brooklyn Data Co, dbt Labs PS, Analytics8)

Professional services firms that manually reverse-engineer spreadsheet logic, design dimensional models, and build dbt projects as consulting engagements.

Pricing: $150-350/hour; typical migration projects run $30K-150K+
Gap: Extremely expensive and slow (weeks to months). Doesn't scale. No reusable tooling — every engagement starts from scratch. Out of reach for mid-market companies.
ChatGPT / Claude (manual LLM-assisted conversion)

Users paste individual Excel formulas into LLMs and ask for SQL or dbt equivalents. Some lightweight wrappers and GPT plugins exist for this workflow.

Pricing: ~$20-100/month for API/subscription access
Gap: No systematic approach — one formula at a time with no context. Cannot track cell dependencies, cross-sheet references, or data lineage. No automated project generation. Cannot understand the spreadsheet as a whole system.
MVP Suggestion

A web app where users upload 1-5 Excel workbooks. The tool: (1) extracts and catalogs all formulas, named ranges, and cell dependencies into a visual DAG, (2) identifies metrics/KPIs (cells that look like final outputs), (3) generates draft dbt SQL models with inline comments explaining the original Excel logic, (4) outputs a data dictionary. Skip dashboard wireframes for MVP. Skip pivot table parsing initially. Focus on formula-heavy workbooks with SUMIFS, VLOOKUPs, and IF-chains — that's where 80% of the value is. Ship as a CLI + simple web UI.
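
Steps (1) and (2) above can be sketched with the standard library alone. The example assumes formulas have already been parsed into a mapping from each formula cell to the cells it reads. The cell names and the helper names `evaluation_order`, `candidate_outputs`, and `raw_inputs` are hypothetical. Treating "cells nothing else references" as final outputs is a heuristic, not a guarantee.

```python
from graphlib import TopologicalSorter

# Hypothetical parsed workbook: each formula cell maps to the cells it reads.
deps = {
    "Report!B2": {"Data!C2", "Data!D2"},
    "Report!B3": {"Report!B2", "Rates!A1"},
    "Report!B4": {"Report!B3"},
}

def evaluation_order(deps):
    """Order cells so every cell appears after the cells it depends on.

    Raises graphlib.CycleError if the workbook contains circular references.
    """
    return list(TopologicalSorter(deps).static_order())

def candidate_outputs(deps):
    """Formula cells that no other formula reads: likely metrics/KPIs."""
    referenced = set().union(*deps.values())
    return sorted(cell for cell in deps if cell not in referenced)

def raw_inputs(deps):
    """Referenced cells that hold no formula: likely source data."""
    referenced = set().union(*deps.values())
    return sorted(referenced - set(deps))


print(evaluation_order(deps))
print(candidate_outputs(deps))
print(raw_inputs(deps))
```

The same ordering could plausibly drive dbt model layering, with raw inputs becoming sources and candidate outputs becoming the final models, though that mapping is a design assumption rather than something the text specifies.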

Monetization Path

Free tier: analyze 1 workbook, show the dependency DAG and metric catalog (lead gen + wow factor). Paid: $500 for up to 10 workbooks with full dbt model generation. Pro: $2000 for unlimited workbooks + Slack support + revision cycles. Long-term: pivot toward 'spreadsheet intelligence platform' — ongoing monitoring of new spreadsheets created in the org, automatic dbt model suggestions, spreadsheet→warehouse governance layer. Consider white-labeling to data consultancies as a 10x productivity tool for their migration engagements.

Time to Revenue

8-12 weeks to an MVP and a first paying customer. The formula parsing engine and dbt code generation are the technical bottlenecks. Recommend building with 2-3 design partners (real companies mid-migration) to validate output quality before charging. First revenue is likely to come from a pilot at $500, scaling to $2K+ as accuracy improves.

What people are saying
  • deep in the Excel zone
  • almost zero reporting
  • Sudden surge of demand from top execs for full company reporting solutions
  • Figure out what questions the business actually needs answered first