Download the Data
All datasets are compressed zip files, so they are faster to download and smaller to store. Unzip, then open the CSVs in Excel, Google Sheets, Python, or any data tool. No sign-up. No paywall.
Complete Dataset
OnlyNerds Complete Data
Everything in one archive — H1B companies, private market firms, VC portfolios, USA universities, and world universities. The fastest way to get all the data.
Contains: startups_h1b_database.csv · Privately_Listed_Companies.csv · Portfolio.csv · usa-universities.csv · world-universities.csv
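
For Python users, the CSVs can be read straight out of the archive without unzipping it first. The following is a minimal sketch using only the standard library. The function name `load_archive`, the archive filename in the usage comment, and the UTF-8 encoding are assumptions, not something this page specifies.

```python
import csv
import io
import zipfile


def load_archive(archive):
    """Return {member name: list of row dicts} for every CSV in a zip.

    `archive` may be a filesystem path or a binary file-like object.
    """
    tables = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip anything that is not a CSV
            with zf.open(name) as raw:
                # Assumption: files are UTF-8; "utf-8-sig" also drops a BOM.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables


# Usage (the filename below is hypothetical):
# tables = load_archive("complete_data.zip")
# for name, rows in tables.items():
#     print(name, len(rows))
```

Each row is a plain dict keyed by the column headers, so the schemas listed under CSV Schemas map directly onto dictionary keys.
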
Download Complete Data (2.9 MB)
Individual Files
USA Universities
2,000+ accredited US universities and colleges. Institution name, state, and official website. Sourced from publicly available academic directories.
World Universities
9,000+ universities across 200+ countries including all US institutions. Country code, institution name, and official website. Sourced from public academic registries.
CSV Schemas
startups_h1b_database.csv
| Column | Values |
|---|---|
| Company Name | String |
| Business Sector | String |
| Category | String |
| Tags | Comma-separated string |
| H1B Sponsorship Likelihood | Very High / High / Medium / Low / Unknown |
| Fortune 500 | Yes / No |
| Fortune 1500 | Yes / No |
| Boutique | Yes / No |
| AnalystsPick | Yes / No |
| Publicly traded | Yes / No |
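
As an illustration of working with this schema, the sketch below keeps companies rated Very High or High and splits the Tags string into a list. It assumes the header names match the table above exactly. `likely_sponsors` and the sample rows are made up for the example.

```python
import csv
import io

# Ratings treated as "likely"; the cut-off is an illustrative choice.
LIKELY = {"Very High", "High"}


def likely_sponsors(rows):
    """Return (company, tags) pairs for rows rated Very High or High."""
    matches = []
    for row in rows:
        if row["H1B Sponsorship Likelihood"].strip() in LIKELY:
            # Tags is one comma-separated string; split it into a clean list.
            tags = [t.strip() for t in row["Tags"].split(",") if t.strip()]
            matches.append((row["Company Name"], tags))
    return matches


# Made-up rows for illustration only; real rows come from the CSV file.
SAMPLE = '''Company Name,Tags,H1B Sponsorship Likelihood
Example Labs,"ai, robotics",Very High
Sample Corp,fintech,Low
'''

if __name__ == "__main__":
    rows = csv.DictReader(io.StringIO(SAMPLE))
    for company, tags in likely_sponsors(rows):
        print(company, tags)
```

To run it on the real file, replace the `io.StringIO(SAMPLE)` source with an open handle on startups_h1b_database.csv.
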
Privately_Listed_Companies.csv
| Column | Values |
|---|---|
| Company_Name | String |
| Sector | String |
| Funding_round | Pre-Seed / Seed / Series A–F / Growth / Late Stage / Private Equity |
| Amount_raised | String (e.g. $50M) or -- |
| Sub-sector | String |
| Source Website | URL |
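
Because Amount_raised is stored as a string, it needs converting before it can be sorted or summed. The sketch below is one possible parser. Only the `$50M` form and the `--` placeholder come from the table above. Support for K and B suffixes, thousands separators, and returning `None` for unrecognized text are assumptions.

```python
import re

# Suffix multipliers; only "M" is confirmed by the schema, K and B are assumed.
_MULTIPLIERS = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}
_AMOUNT = re.compile(r"^\$?\s*([\d,]*\.?\d+)\s*([KMB]?)$", re.IGNORECASE)


def parse_amount(value):
    """Convert strings like "$50M" to a float in dollars.

    Returns None for the "--" placeholder, empty strings, and anything
    that does not match the expected pattern.
    """
    text = value.strip()
    if text in ("", "--"):
        return None
    match = _AMOUNT.match(text)
    if match is None:
        return None
    number, suffix = match.groups()
    return float(number.replace(",", "")) * _MULTIPLIERS[suffix.upper()]


if __name__ == "__main__":
    for raw in ["$50M", "--", "$1.5B"]:
        print(raw, parse_amount(raw))
```

Returning `None` instead of 0 keeps undisclosed amounts distinguishable from real values when filtering or averaging.
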
Portfolio.csv
| Column | Values |
|---|---|
| Name | String |
| Type | VC / accelerator |
| Portfolio | URL |
| JobBoard | URL or N/A |
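
The JobBoard column mixes URLs with the literal N/A, so a filter has to skip the placeholder. A small sketch follows, assuming the header names above. `with_job_boards` and the sample rows are hypothetical.

```python
import csv
import io


def with_job_boards(rows, kind=None):
    """Return (name, job board URL) pairs, skipping rows marked N/A.

    `kind` optionally restricts results to one Type value, compared
    case-insensitively (for example "VC" or "accelerator").
    """
    result = []
    for row in rows:
        board = row["JobBoard"].strip()
        if not board or board.upper() == "N/A":
            continue  # no job board listed for this entry
        if kind is not None and row["Type"].strip().lower() != kind.lower():
            continue
        result.append((row["Name"], board))
    return result


# Made-up rows for illustration only.
SAMPLE = '''Name,Type,Portfolio,JobBoard
Example Ventures,VC,https://example.com/portfolio,https://example.com/jobs
Sample Accelerator,accelerator,https://example.org/companies,N/A
'''

if __name__ == "__main__":
    rows = list(csv.DictReader(io.StringIO(SAMPLE)))
    print(with_job_boards(rows, kind="VC"))
```
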
usa-universities.csv / world-universities.csv
| Column | Values |
|---|---|
| Country Code | ISO 3166-1 alpha-2 (e.g. US, IN, DE) |
| University Name | String |
| Website | URL |
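
A quick way to explore the university files is to count institutions per country code. The sketch below assumes the headers above. `count_by_country` and the sample rows are made up. One point worth knowing: the alpha-2 code for Namibia is `NA`, which pandas `read_csv` treats as a missing value by default unless `keep_default_na=False` is passed. The standard-library `csv` module used here keeps it as a plain string.

```python
import csv
import io
from collections import Counter


def count_by_country(rows):
    """Count universities per ISO 3166-1 alpha-2 country code."""
    return Counter(row["Country Code"].strip().upper() for row in rows)


# Made-up rows for illustration only.
SAMPLE = '''Country Code,University Name,Website
US,Example State University,https://example.edu
NA,Sample University of Namibia,https://example.org
us,Example Tech,https://example.com
'''

if __name__ == "__main__":
    rows = csv.DictReader(io.StringIO(SAMPLE))
    for code, total in count_by_country(rows).most_common():
        print(code, total)
```
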
Notes
Public records only
All data is sourced from publicly available records: USCIS filings, company career pages, LCA disclosures, and public academic registries. No paywalled or scraped data.
Licensed Apache 2.0
Free to use, fork, and build on. Attribution appreciated. See the GitHub repo for the full license. Do not re-sell the data as-is.
Contribute updates
If you find outdated entries, open a GitHub issue or PR with a public source reference. Community-maintained data stays fresh.