H-1B Fraud Detector

The Gap

Shell companies with fake addresses, sequential tax IDs, and duplicate petitions are gaming the H-1B lottery, diluting chances for legitimate applicants and employers. Detection is currently manual and investigative.

Solution

A SaaS tool that cross-references USCIS petition data, IRS EIN registries, state LLC filings, and address validation APIs to flag suspicious patterns — sequential tax IDs, shared addresses, overlapping officers, fake zip codes, and bulk filings from newly formed entities.
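As a rough illustration of the shared-address signal, the sketch below groups filers by a crudely normalized address string. The function names, the `min_filers` threshold, and the sample records are all hypothetical and invented for this example. A real pipeline would more likely rely on a proper address validation service than on regex cleanup.

```python
import re
from collections import defaultdict


def normalize_address(raw: str) -> str:
    """Crude normalization: lowercase, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", raw.lower())
    return " ".join(cleaned.split())


def shared_addresses(filers, min_filers=3):
    """Map each normalized address to the employers using it.

    Only addresses used by at least `min_filers` distinct employers are kept.
    The threshold is an illustrative assumption, not a calibrated value.
    """
    by_address = defaultdict(set)
    for name, address in filers:
        by_address[normalize_address(address)].add(name)
    return {
        addr: sorted(names)
        for addr, names in by_address.items()
        if len(names) >= min_filers
    }


# Invented sample records, for illustration only.
filers = [
    ("Acme Tech LLC", "100 Main St., Suite 4, Edison, NJ 08817"),
    ("Bolt Systems LLC", "100 main st suite 4 edison nj 08817"),
    ("Cobalt IT Inc", "100 Main St, Suite 4, Edison NJ 08817"),
    ("Delta Corp", "5 Oak Ave, Austin, TX 78701"),
]
clusters = shared_addresses(filers)
```

A shared address is only a signal: registered-agent offices and coworking spaces legitimately host many unrelated companies, so a cluster like this would feed a risk score rather than a verdict.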

Revenue Model

Subscription: tiered per-seat pricing for law firms, enterprise plans for large employers, and API access for government and compliance teams

Feasibility Scores

Pain Intensity: 7/10

The pain is real but unevenly distributed. For USCIS compliance teams and advocacy orgs, this is a burning need — the Reddit post detailing 576 fake petitions from connected shell companies shows investigators spending weeks on manual cross-referencing. For immigration law firms, it's a 'nice to have' for due diligence on competing petitions or client vetting. The pain signal is genuine (manual investigative work, sequential tax IDs, fake zip codes) but the number of people who feel this pain acutely enough to pay is narrower than it first appears.

Market Size: 5/10

This is a niche within a niche. TAM calculation: ~5,000 immigration law firms in the US × maybe 20% that care about fraud detection = 1,000 firms × $200/mo = $2.4M ARR from law firms. Add corporate immigration departments at Fortune 500 companies (~200 serious buyers × $1K/mo) = $2.4M ARR, for roughly $4.8M ARR combined. Government contracts could be large but take years. Journalists and advocacy orgs have tiny budgets. The realistic serviceable market has a $5-10M ARR ceiling. That's a solid lifestyle business or a small venture, not a venture-scale opportunity unless you expand the platform.
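The arithmetic behind that estimate can be checked with a few lines. The buyer counts and prices are the figures quoted above. The function and variable names are just illustrative.

```python
def annual_revenue(buyers: int, monthly_price: int) -> int:
    """Annual recurring revenue for a segment: buyers x monthly price x 12."""
    return buyers * monthly_price * 12


law_firm_buyers = 5_000 * 20 // 100                      # 20% of ~5,000 firms
law_firm_arr = annual_revenue(law_firm_buyers, 200)      # $200/mo per firm
corporate_arr = annual_revenue(200, 1_000)               # ~200 buyers at $1K/mo
combined_arr = law_firm_arr + corporate_arr

print(f"Law firms: ${law_firm_arr:,}")
print(f"Corporate: ${corporate_arr:,}")
print(f"Combined:  ${combined_arr:,}")
```

The two named segments sum to $4.8M, just under the low end of the $5-10M ceiling, so the remainder of that range would have to come from government contracts or smaller buyers.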

Willingness to Pay: 5/10

Mixed signals. Immigration law firms pay $79-199/mo for case management tools (Docketwise); fraud detection is adjacent but not core workflow. Corporate immigration teams at large employers already spend $50K-500K/year on compliance platforms, but detecting fraud committed by other filers isn't their primary concern. Government procurement is high-value but brutally slow (12-18 month sales cycles). Journalists and advocacy orgs are mostly price-sensitive. The willingness exists in pockets but isn't widespread yet, so you'd need to create the category.

Technical Feasibility: 8/10

Highly feasible for a solo dev MVP. Core data sources are public: DOL LCA disclosure files, USCIS employer data hub, state LLC registration databases (many have APIs or bulk downloads), and USPS address validation. Pattern detection (sequential EINs, shared addresses, bulk filing detection) is straightforward data engineering — no ML required for v1. The hardest part is stitching together data from heterogeneous state-level sources, but you can start with the top fraud states (Texas, New Jersey, California). A competent full-stack dev could build a functional MVP in 4-6 weeks.
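To show why sequential-EIN detection counts as straightforward data engineering, here is a minimal sketch that finds runs of consecutive EINs. It assumes EINs are already available as `XX-XXXXXXX` strings, which is optimistic given that EIN data is not fully public. The function name and the `min_run` threshold are hypothetical.

```python
def sequential_ein_runs(eins, min_run=3):
    """Return runs of consecutive EINs with at least `min_run` members.

    EINs are compared as 9-digit integers; duplicates are ignored.
    """
    numbers = sorted({int(e.replace("-", "")) for e in eins})
    runs, current = [], []
    for n in numbers:
        if current and n == current[-1] + 1:
            current.append(n)          # extends the current consecutive run
        else:
            if len(current) >= min_run:
                runs.append(current)
            current = [n]              # start a new run
    if len(current) >= min_run:
        runs.append(current)

    def fmt(n):
        digits = f"{n:09d}"            # restore leading zeros
        return f"{digits[:2]}-{digits[2:]}"

    return [[fmt(n) for n in run] for run in runs]


# Invented sample EINs, for illustration only.
sample = ["12-3456789", "12-3456791", "12-3456790", "98-7654321", "12-3456795"]
runs = sequential_ein_runs(sample)
```

Sorting first keeps the scan linear after an O(n log n) sort, which should be fast enough for a full season of filings on a single machine.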

Competition Gap: 9/10

This is the strongest dimension. There is no direct competitor: nobody has built a purpose-built H-1B fraud detection tool. The closest adjacencies are raw data lookups (MyVisaJobs, H1BGrader) with no analysis, enterprise entity verification tools (Sayari, OpenCorporates) with no immigration awareness, and internal government systems (FDNS/ATLAS) that are not public. You would be creating a new category. First-mover advantage is significant because whoever first builds the canonical database of flagged entities creates a compounding data moat.

Recurring Potential: 7/10

Subscription model works because the underlying data changes with every H-1B season (annual lottery cycle in March, quarterly LCA disclosures). Customers would need ongoing access to fresh flags and updated entity analysis. The H-1B cycle creates natural annual renewal triggers. However, some users (journalists) may only need access episodically, and law firms might only check during filing season. You'd need to add continuous monitoring features (alerts when new suspicious entities appear) to drive monthly engagement.
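One simple way to picture the alerting feature is a diff between two scoring snapshots. The sketch below assumes each snapshot is a mapping from entity name to a 0-100 risk score. The snapshot format, the threshold of 70, and the entity names are all hypothetical.

```python
def new_alerts(previous: dict[str, int], current: dict[str, int], threshold: int = 70) -> list[str]:
    """Entities at or above `threshold` now that were not flagged in the previous snapshot."""
    alerts = []
    for entity, score in current.items():
        was_flagged = previous.get(entity, 0) >= threshold
        if score >= threshold and not was_flagged:
            alerts.append(entity)
    return sorted(alerts)


# Invented snapshots, for illustration only.
last_quarter = {"Entity A": 80, "Entity B": 40}
this_quarter = {"Entity A": 85, "Entity B": 75, "Entity C": 90, "Entity D": 10}
alerts = new_alerts(last_quarter, this_quarter)
```

Here Entity A stays quiet because it was already flagged, while Entity B (newly over the threshold) and Entity C (newly seen) would trigger alerts.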

Strengths

+ Zero direct competition: you'd be creating a new category with genuine first-mover advantage
+ Problem is government-validated: USCIS changed lottery rules specifically because of the fraud you'd detect
+ Public data sources make MVP technically feasible without expensive data licensing
+ Strong narrative and media angle: journalists are already writing about this exact problem, which creates organic marketing
+ Compounding data moat: the longer you run, the more historical patterns and flagged entities you accumulate

Risks

! Small addressable market: this may be a $5-10M ARR ceiling business, not venture-scale. Be honest about whether that matches your ambition.
! Government customers are high-value but have 12-18 month procurement cycles that can kill a bootstrapped startup's runway
! Legal liability risk: incorrectly flagging a legitimate company as fraudulent could invite defamation claims. You need careful disclaimers and a 'risk scoring' approach rather than definitive fraud labels.
! USCIS could build this internally or improve their public tools, wiping out your value prop overnight
! Data quality risk: state-level LLC databases are inconsistent, some lag months behind, and EIN data is not fully public. Your analysis is only as good as your data sources.

Competition

MyVisaJobs

Aggregates public H-1B LCA and PERM data. Users can search employers, salaries, job titles, and approval rates from DOL disclosure data.

Pricing: Free with ads; premium ~$30-50/month for job seekers

Gap: Zero fraud detection. No shell company flagging, no pattern analysis, no entity verification. Purely informational — shows data but draws no conclusions about legitimacy.

H1BGrader

Grades H-1B sponsors based on approval rates, salary data, and petition volumes from public DOL/USCIS data.

Pricing: Free

Gap: No analytical depth whatsoever. No fraud indicators, no entity cross-referencing, no network analysis. A lightweight consumer tool, not a compliance or investigative platform.

Sayari Analytics

Corporate entity graph analytics platform that maps ownership structures, identifies shell companies and beneficial owners. Used by financial crime and AML teams.

Pricing: Enterprise only; $50K+/year

Gap: No immigration-specific integration whatsoever. Cannot correlate entity data with H-1B petition filings, LCA data, or USCIS patterns. Would require significant custom work to apply to visa fraud — and at $50K+/year, inaccessible to law firms and journalists.

OpenCorporates

World's largest open database of corporate entities with API access for entity verification across jurisdictions.

Pricing: Free tier; API plans $500-5,000/month

Gap: Not immigration-aware at all. No H-1B linkage, no petition data, no fraud scoring. You could manually cross-reference their data with USCIS filings, but there's no product doing that synthesis. Raw ingredient, not a finished dish.

USCIS H-1B Employer Data Hub

Government-published dataset showing petition counts, approval/denial rates, and basic employer info for H-1B sponsors.

Pricing: Free (public government data)

Gap: Extremely basic — no entity verification, no cross-referencing with state LLC filings or IRS data, no pattern detection, no fraud indicators. Raw data dump with a minimal search interface. Requires technical skill to extract any insight beyond basic lookups.

MVP Suggestion

A web app with three core features: (1) Employer Risk Score — enter any H-1B sponsor name and get an instant fraud risk score based on public data cross-referencing (entity age vs. petition volume, address validation, officer overlap with other filers); (2) Pattern Alerts — a dashboard showing newly detected suspicious clusters (sequential EINs, shared addresses, bulk filings from entities formed in the last 6 months); (3) Entity Graph — visual network map showing connections between related companies, shared officers, and shared addresses. Start with data from 3-5 top fraud states (TX, NJ, CA, IL, NY). Use DOL LCA disclosures + state LLC databases + USPS address validation API. No ML needed for v1 — rule-based pattern detection is sufficient and more explainable.
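A rule-based Employer Risk Score along these lines might look like the sketch below. It combines the three signals named above (entity age vs. petition volume, address validation, officer overlap) and returns the reasons alongside the score so the result stays explainable. The `Employer` fields, the weights, and the cutoffs (10 petitions, 183 days, 2 overlapping filers) are all assumptions for illustration, not calibrated values.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Employer:
    name: str
    formed_on: date            # formation date from the state LLC filing
    petitions: int             # petitions filed this season
    address_valid: bool        # result of an address validation lookup
    officers_shared_with: int  # other filers sharing at least one officer


def risk_score(e: Employer, as_of: date) -> tuple[int, list[str]]:
    """Return a 0-100 risk score plus the human-readable rules that fired."""
    score, reasons = 0, []
    age_days = (as_of - e.formed_on).days
    if age_days < 183 and e.petitions >= 10:
        score += 40
        reasons.append("bulk filings from an entity formed in the last 6 months")
    if not e.address_valid:
        score += 30
        reasons.append("address failed validation")
    if e.officers_shared_with >= 2:
        score += 30
        reasons.append("officers overlap with other filers")
    return min(score, 100), reasons


# Invented employers, for illustration only.
today = date(2024, 4, 1)
new_shell = Employer("Example Shell LLC", date(2024, 1, 15), 25, False, 3)
established = Employer("Example Established Inc", date(2010, 1, 1), 25, True, 0)
shell_score, shell_reasons = risk_score(new_shell, today)
est_score, est_reasons = risk_score(established, today)
```

Returning the fired rules with the number supports the 'risk scoring' framing: a user sees which public-data signals raised the score instead of a bare fraud label.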

Monetization Path

Free tier: basic employer lookup with limited risk signals (hooks journalists and researchers, generates SEO content) → Pro ($149/mo): full risk scores, pattern alerts, entity graphs, export capabilities (targets immigration law firms) → Enterprise ($500-2,000/mo): API access, custom monitoring, bulk analysis, priority data freshness (targets corporate immigration departments) → Government ($50K+/year): dedicated instance, enhanced data feeds, custom integrations, SLA (targets USCIS/DOJ, likely requires FedRAMP journey)

Time to Revenue

8-12 weeks to first dollar. Weeks 1-6: build MVP with employer risk scoring and basic pattern detection. Weeks 6-8: beta with 5-10 immigration lawyers from LinkedIn outreach and immigration forums. Weeks 8-12: launch Pro tier, target first 10-20 paying customers from immigration law firm communities. Government revenue is 12-18 months out minimum.

What people are saying

“Sequential tax IDs across 10 shell companies”
“Fake zip codes used in registrations”
“576 fake petitions from connected entities”
“Same typo repeated across filings suggesting batch generation”
“Owners listed as HR managers to hide true ownership”
“Address doesn't exist — registered at nonexistent zip code”
