PPP Loans
Full schema reference, querying examples, aggregate functions, and data quality notes for the PPP loan dataset.
The ppp_loans table contains ~11.5 million Paycheck Protection Program loan records from the SBA's FOIA dataset.
Source
| Property | Value |
|---|---|
| Source | SBA FOIA Portal |
| Total records | ~11,500,000 |
| Last SBA update | September 30, 2024 |
| Refresh strategy | Static (PPP program has ended) |
Sample Record
A complete JSON object returned by the API (with select=*):
{
"id": 7654321,
"loan_number": "3856107205",
"date_approved": "2021-02-15",
"sba_office_code": "0361",
"processing_method": "PPS",
"loan_status": "Paid in Full",
"loan_status_date": "2022-06-01",
"term": 60,
"sba_guaranty_percentage": 100,
"initial_approval_amount": 187500.00,
"current_approval_amount": 187500.00,
"undisbursed_amount": 0,
"forgiveness_amount": 187500.00,
"forgiveness_date": "2022-03-15",
"borrower_name": "TONY'S ITALIAN KITCHEN LLC",
"borrower_address": "1234 MAIN ST",
"borrower_city": "AUSTIN",
"borrower_state": "TX",
"borrower_zip": "78701",
"business_type": "Limited Liability Company(LLC)",
"naics_code": "722511",
"franchise_name": null,
"business_age_description": "Existing or more than 2 years old",
"jobs_reported": 35,
"non_profit": null,
"rural_urban_indicator": "U",
"hubzone_indicator": "N",
"lmi_indicator": null,
"race": "Unanswered",
"ethnicity": "Unknown/NotStated",
"gender": "Unanswered",
"veteran": "Unanswered",
"cd": "TX-25",
"project_city": "AUSTIN",
"project_county_name": "TRAVIS",
"project_state": "TX",
"project_zip": "78701",
"servicing_lender_name": "Wells Fargo Bank, National Association",
"servicing_lender_location_id": "56267",
"servicing_lender_address": "101 N PHILLIPS AVE",
"servicing_lender_city": "SIOUX FALLS",
"servicing_lender_state": "SD",
"servicing_lender_zip": "57104",
"originating_lender": "Wells Fargo Bank, National Association",
"originating_lender_location_id": "56267",
"originating_lender_city": "SIOUX FALLS",
"originating_lender_state": "SD",
"utilities_proceed": 5000.00,
"payroll_proceed": 165000.00,
"mortgage_interest_proceed": 0,
"rent_proceed": 17500.00,
"refinance_eidl_proceed": 0,
"health_care_proceed": 0,
"debt_interest_proceed": 0
}The naics_code value 722511 means "Full-Service Restaurants" — see the NAICS Code Reference for the full hierarchy.
Schema
Loan Identification
| Column | Type | Description |
|---|---|---|
id | bigint | Internal primary key |
loan_number | text | SBA-assigned loan ID (unique) |
date_approved | date | Loan approval date |
sba_office_code | text | SBA district office code |
processing_method | text | PPP (first draw) or PPS (second draw) |
loan_status | text | Current loan status |
loan_status_date | date | Date of last status change |
term | integer | Loan term in months |
sba_guaranty_percentage | numeric | SBA guaranty percentage |
Financial Amounts
| Column | Type | Description |
|---|---|---|
initial_approval_amount | numeric | Originally approved amount |
current_approval_amount | numeric | Current approved amount |
undisbursed_amount | numeric | Amount not yet disbursed |
forgiveness_amount | numeric | Amount forgiven |
forgiveness_date | date | Date forgiveness was processed |
Borrower Information
| Column | Type | Description |
|---|---|---|
borrower_name | text | Business name |
borrower_address | text | Street address |
borrower_city | text | City |
borrower_state | text | 2-letter state code |
borrower_zip | text | 5-digit ZIP code |
Business Characteristics
| Column | Type | Description |
|---|---|---|
business_type | text | Business entity type |
naics_code | text | 6-digit NAICS industry code — hierarchical, use like for prefix queries |
franchise_name | text | Franchise name (if applicable) |
business_age_description | text | Business age description |
jobs_reported | integer | Number of jobs reported |
non_profit | text | Y or NULL |
rural_urban_indicator | text | R or U |
hubzone_indicator | text | Y or N |
lmi_indicator | text | Y or NULL |
Demographics (Self-Reported)
| Column | Type | Description |
|---|---|---|
race | text | Race (high NULL rate) |
ethnicity | text | Ethnicity (high NULL rate) |
gender | text | Gender (high NULL rate) |
veteran | text | Veteran status |
Geographic / Congressional
| Column | Type | Description |
|---|---|---|
cd | text | Congressional district (e.g., CA-12) |
project_city | text | Project location city |
project_county_name | text | Project county |
project_state | text | Project state |
project_zip | text | Project ZIP code |
Lender Information
| Column | Type | Description |
|---|---|---|
servicing_lender_name | text | Servicing lender name |
servicing_lender_location_id | text | Servicing lender location ID |
servicing_lender_address | text | Servicing lender address |
servicing_lender_city | text | Servicing lender city |
servicing_lender_state | text | Servicing lender state |
servicing_lender_zip | text | Servicing lender ZIP |
originating_lender | text | Originating lender name |
originating_lender_location_id | text | Originating lender location ID |
originating_lender_city | text | Originating lender city |
originating_lender_state | text | Originating lender state |
Proceeds Breakdown
| Column | Type | Description |
|---|---|---|
utilities_proceed | numeric | Utilities portion |
payroll_proceed | numeric | Payroll portion |
mortgage_interest_proceed | numeric | Mortgage interest portion |
rent_proceed | numeric | Rent portion |
refinance_eidl_proceed | numeric | EIDL refinance portion |
health_care_proceed | numeric | Healthcare portion |
debt_interest_proceed | numeric | Debt interest portion |
Search
| Column | Type | Description |
|---|---|---|
borrower_name_search | tsvector | Full-text search vector (auto-generated) |
Fast-Lookup Columns
These columns are optimized for fast filtering and sorting:
loan_number(unique)borrower_stateborrower_zipborrower_city(when combined withborrower_state) — case-insensitive matchingnaics_codeinitial_approval_amountforgiveness_amountjobs_reporteddate_approvedprocessing_methodloan_statusborrower_name_search(full-text search)
At least one of these columns must appear as a filter in table queries. borrower_city requires borrower_state as a co-filter (composite index). Guardrail violations return a 400 response.
Querying Examples
For general query syntax (operators, pagination, sorting), see the Query Syntax Reference.
Find businesses by name in a state
curl "https://newbizbot.ai/api/data/ppp_loans?borrower_name_search=fts.starbucks&borrower_state=eq.WA&limit=5" \
-H "Authorization: Bearer YOUR_API_KEY"[
{
"borrower_name": "STARBUCKS CORPORATION",
"borrower_city": "SEATTLE",
"borrower_state": "WA",
"initial_approval_amount": 0,
"naics_code": "722515",
"jobs_reported": 228769,
"date_approved": "2020-04-10",
"loan_status": "Exemption 4"
}
]Largest loans in an industry
# Top 5 construction loans in Florida
curl "https://newbizbot.ai/api/data/ppp_loans?borrower_state=eq.FL&naics_code=like.23*&order=initial_approval_amount.desc&limit=5&select=borrower_name,borrower_city,initial_approval_amount,naics_code,jobs_reported" \
-H "Authorization: Bearer YOUR_API_KEY"[
{
"borrower_name": "AJAX BUILDING COMPANY, LLC",
"borrower_city": "TAMPA",
"initial_approval_amount": 10000000.00,
"naics_code": "236220",
"jobs_reported": 500
},
{
"borrower_name": "COASTAL CONSTRUCTION GROUP OF SOUTH FLORIDA, INC.",
"borrower_city": "MIAMI",
"initial_approval_amount": 10000000.00,
"naics_code": "236220",
"jobs_reported": 437
}
]Loans by ZIP code
# PPP loans in downtown Manhattan (ZIP 10007)
curl "https://newbizbot.ai/api/data/ppp_loans?borrower_zip=eq.10007&order=initial_approval_amount.desc&limit=5" \
-H "Authorization: Bearer YOUR_API_KEY"Second-draw loans only
# Second-draw PPP loans (PPS) in healthcare
curl "https://newbizbot.ai/api/data/ppp_loans?processing_method=eq.PPS&naics_code=like.62*&borrower_state=eq.TX&limit=10" \
-H "Authorization: Bearer YOUR_API_KEY"Loans without forgiveness
# Loans in California that haven't been forgiven yet
curl "https://newbizbot.ai/api/data/ppp_loans?borrower_state=eq.CA&forgiveness_amount=is.null&initial_approval_amount=gt.100000&order=initial_approval_amount.desc&limit=10" \
-H "Authorization: Bearer YOUR_API_KEY"Non-profit organizations
# Non-profit PPP recipients in New York
curl "https://newbizbot.ai/api/data/ppp_loans?non_profit=eq.Y&borrower_state=eq.NY&order=initial_approval_amount.desc&limit=10&select=borrower_name,borrower_city,initial_approval_amount,naics_code,jobs_reported" \
-H "Authorization: Bearer YOUR_API_KEY"Natural Language → API Translation
These examples show how to translate common human questions into API calls. Each example breaks down the reasoning for filter selection.
"Find one-person legal recruiting firms in the DC/MD/VA area"
Thinking: "Legal recruiting" = staffing industry (NAICS 561311, 561312) + "legal" in name. "One-person" = jobs_reported=eq.1. "DC/MD/VA" = multi-state in.(DC,MD,VA).
# Step 1: Broad query — all staffing firms with 1 employee in DMV
?borrower_state=in.(DC,MD,VA)&jobs_reported=eq.1&naics_code=in.(561311,561312)&order=initial_approval_amount.desc&limit=100
# Step 2: Narrow — only firms with "legal" in the name
?borrower_state=in.(DC,MD,VA)&jobs_reported=eq.1&naics_code=in.(561311,561312)&borrower_name=ilike.*legal*&limit=100Note:
borrower_nameis not an indexed filter column — it works here as an additional filter alongside the indexedborrower_stateandnaics_codefilters. At least one fast-lookup column must always be present.
"What are the biggest restaurants in Miami?"
Thinking: "Restaurants" = NAICS sector 7225. "Biggest" = sort by loan amount (proxy for size). "Miami" = city filter requires state co-filter (FL).
?borrower_state=eq.FL&borrower_city=eq.MIAMI&naics_code=like.7225*&order=initial_approval_amount.desc&limit=20&select=borrower_name,borrower_city,initial_approval_amount,naics_code,jobs_reported"Show me PE firms and hedge funds in New York"
Thinking: PE/VC/hedge funds = NAICS 523999. "New York" = state NY (for the state) or add city filter for NYC specifically.
# All of New York state
?borrower_state=eq.NY&naics_code=eq.523999&order=initial_approval_amount.desc&limit=50&select=borrower_name,borrower_city,initial_approval_amount,jobs_reported,business_type
# Just Manhattan (narrow by city — URL-encode spaces: NEW%20YORK)
?borrower_state=eq.NY&borrower_city=eq.NEW%20YORK&naics_code=eq.523999&order=initial_approval_amount.desc&limit=50"Find small dental practices in Texas that didn't get their loans forgiven"
Thinking: "Dental" = NAICS 621210. "Small" = low jobs_reported. "Didn't get forgiven" = forgiveness_amount is null. "Texas" = state TX.
?borrower_state=eq.TX&naics_code=eq.621210&forgiveness_amount=is.null&jobs_reported=lte.5&order=initial_approval_amount.desc&limit=50&select=borrower_name,borrower_city,initial_approval_amount,jobs_reported,loan_status"How many coffee shops got PPP loans in California?"
Thinking: "Coffee shops" = NAICS 722515. "How many" = count query. "California" = state CA.
# Count only (no rows returned)
curl -I "https://newbizbot.ai/api/data/ppp_loans?borrower_state=eq.CA&naics_code=eq.722515&limit=0" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Prefer: count=exact"
# Read the Content-Range response header for the count"Find IT consulting companies with 10-50 employees in the Tri-state area"
Thinking: "IT consulting" = NAICS 541512. "10-50 employees" = range filter on jobs_reported. "Tri-state" = NY, NJ, CT.
?borrower_state=in.(NY,NJ,CT)&naics_code=eq.541512&jobs_reported=gte.10&jobs_reported=lte.50&order=jobs_reported.desc&limit=100&select=borrower_name,borrower_city,borrower_state,initial_approval_amount,jobs_reportedNote: Repeating the same parameter (
jobs_reported=gte.10&jobs_reported=lte.50) is valid PostgREST syntax — both conditions are ANDed together to create a range filter.
"List all nonprofits in healthcare across the Southeast"
Thinking: "Nonprofits" = non_profit=eq.Y. "Healthcare" = NAICS sector 62. "Southeast" = multi-state region. Note: non_profit is not an indexed column, but works as an additional filter alongside the indexed borrower_state filter.
?borrower_state=in.(GA,FL,NC,SC,AL,TN,MS,LA)&naics_code=like.62*&non_profit=eq.Y&order=initial_approval_amount.desc&limit=100&select=borrower_name,borrower_city,borrower_state,naics_code,initial_approval_amount,jobs_reported"Compare PPP loan totals for restaurants vs construction in Florida"
Thinking: This is an aggregate query — use the RPC endpoint, not the table.
# Restaurants in FL
curl "https://newbizbot.ai/api/data/rpc/ppp_stats_by_industry?p_state=FL&p_naics_prefix=7225" \
-H "Authorization: Bearer YOUR_API_KEY"
# Construction in FL
curl "https://newbizbot.ai/api/data/rpc/ppp_stats_by_industry?p_state=FL&p_naics_prefix=23" \
-H "Authorization: Bearer YOUR_API_KEY"Translation Cheat Sheet
| Human concept | API filter |
|---|---|
| Geographic region | borrower_state=in.(STATE1,STATE2,...) |
| Specific city | borrower_state=eq.XX&borrower_city=eq.CITY (state required) |
| By ZIP code | borrower_zip=eq.XXXXX |
| Industry / sector | naics_code=like.XX* or naics_code=eq.XXXXXX (see NAICS codes) |
| Niche within industry | NAICS filter + borrower_name=ilike.*keyword* |
| Company size | jobs_reported=eq.N or jobs_reported=gte.N&jobs_reported=lte.M |
| Loan size | initial_approval_amount=gt.N |
| Forgiveness status | forgiveness_amount=is.null (unforgiven) or forgiveness_amount=gt.0 |
| First vs second draw | processing_method=eq.PPP or processing_method=eq.PPS |
| Nonprofits | non_profit=eq.Y (not indexed — combine with an indexed filter) |
| Business structure | business_type=eq.Corporation or eq.Sole%20Proprietorship etc. (not indexed — combine with an indexed filter) |
| "How many" | Use Prefer: count=exact header + limit=0 |
| "Total/average" | Use RPC: /rpc/ppp_stats_by_state or /rpc/ppp_stats_by_industry |
| "Biggest / top N" | order=initial_approval_amount.desc&limit=N |
| "Smallest" | order=initial_approval_amount.asc&limit=N |
Aggregate Functions (RPCs)
The API provides pre-built RPC endpoints for aggregate queries and paginated lookups. These are optimized and typically return results in milliseconds.
ppp_stats_by_state
Returns aggregate statistics grouped by state.
Endpoint: GET /data/rpc/ppp_stats_by_state
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
p_state | text | No | Filter to a specific state (2-letter code). Omit for all states. |
Examples
# Single state
curl "https://newbizbot.ai/api/data/rpc/ppp_stats_by_state?p_state=CA" \
-H "Authorization: Bearer YOUR_API_KEY"[
{
"state": "CA",
"total_loans": 1025643,
"total_amount": 98765432100.00,
"avg_amount": 96296.50,
"total_jobs": 12345678,
"total_forgiven": 87654321000.00
}
]# All states
curl "https://newbizbot.ai/api/data/rpc/ppp_stats_by_state" \
-H "Authorization: Bearer YOUR_API_KEY"ppp_stats_by_industry
Returns aggregate statistics grouped by NAICS industry code.
Endpoint: GET /data/rpc/ppp_stats_by_industry
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
p_state | text | No | Filter to a specific state. Omit for nationwide. |
p_naics_prefix | text | No | Filter by NAICS code prefix (e.g., 72 for Accommodation and Food Services). See NAICS Code Reference. |
Examples
# Restaurant industry in Texas
curl "https://newbizbot.ai/api/data/rpc/ppp_stats_by_industry?p_state=TX&p_naics_prefix=7225" \
-H "Authorization: Bearer YOUR_API_KEY"[
{
"naics_code": "722511",
"total_loans": 18432,
"total_amount": 2156789000.00,
"avg_amount": 117023.45,
"total_jobs": 287654
},
{
"naics_code": "722513",
"total_loans": 15678,
"total_amount": 987654000.00,
"avg_amount": 63001.15,
"total_jobs": 198765
}
]# All industries in Texas
curl "https://newbizbot.ai/api/data/rpc/ppp_stats_by_industry?p_state=TX" \
-H "Authorization: Bearer YOUR_API_KEY"
# Healthcare nationwide
curl "https://newbizbot.ai/api/data/rpc/ppp_stats_by_industry?p_naics_prefix=62" \
-H "Authorization: Bearer YOUR_API_KEY"Returns up to 100 industry codes, sorted by total amount descending.
Combining with NAICS Hierarchy
Use the p_naics_prefix parameter at different levels of the NAICS hierarchy to zoom in or out:
# Broad: entire Accommodation & Food Services sector (72)
curl "https://newbizbot.ai/api/data/rpc/ppp_stats_by_industry?p_state=NY&p_naics_prefix=72" \
-H "Authorization: Bearer YOUR_API_KEY"
# Narrow: just restaurants (7225)
curl "https://newbizbot.ai/api/data/rpc/ppp_stats_by_industry?p_state=NY&p_naics_prefix=7225" \
-H "Authorization: Bearer YOUR_API_KEY"
# Exact: full-service restaurants only (722511)
curl "https://newbizbot.ai/api/data/rpc/ppp_stats_by_industry?p_state=NY&p_naics_prefix=722511" \
-H "Authorization: Bearer YOUR_API_KEY"ppp_loans_by_city
Returns paginated loan records for a specific city, sorted by loan amount descending. Pagination is fast even on high-cardinality cities (e.g., Los Angeles).
Endpoint: GET /data/rpc/ppp_loans_by_city
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
p_state | text | Yes | 2-letter state code |
p_city | text | Yes | City name (case-insensitive) |
p_limit | integer | No | Page size (default 25, max 1,000) |
p_offset | integer | No | Number of rows to skip (default 0) |
Examples
# First 25 loans in Los Angeles, CA
curl "https://newbizbot.ai/api/data/rpc/ppp_loans_by_city?p_state=CA&p_city=LOS%20ANGELES" \
-H "Authorization: Bearer YOUR_API_KEY"[
{
"id": 1234567,
"loan_number": "3856107205",
"borrower_name": "ACME CORP",
"borrower_city": "LOS ANGELES",
"borrower_state": "CA",
"initial_approval_amount": 10000000.00,
"forgiveness_amount": 10000000.00,
"jobs_reported": 500,
"naics_code": "722511",
"business_type": "Corporation",
"date_approved": "2020-04-10",
"total_count": 148523
}
]# Page 2 (rows 26-50)
curl "https://newbizbot.ai/api/data/rpc/ppp_loans_by_city?p_state=CA&p_city=LOS%20ANGELES&p_limit=25&p_offset=25" \
-H "Authorization: Bearer YOUR_API_KEY"The total_count field is included on every row, so you can build pagination UI without a separate count request.
ppp_loans_by_state_sector
Returns paginated loan records filtered by state and one or more 2-digit NAICS sectors, sorted by loan amount descending.
Endpoint: GET /data/rpc/ppp_loans_by_state_sector
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
p_state | text | Yes | 2-letter state code |
p_naics_prefixes | text[] | Yes | Array of 2-digit NAICS sector codes (e.g., {"72","62"}) |
p_limit | integer | No | Page size (default 25, max 1,000) |
p_offset | integer | No | Number of rows to skip (default 0) |
Examples
# Restaurants (72) in Texas
curl "https://newbizbot.ai/api/data/rpc/ppp_loans_by_state_sector?p_state=TX&p_naics_prefixes=%7B72%7D" \
-H "Authorization: Bearer YOUR_API_KEY"
# Restaurants + Healthcare (72, 62) in New York
curl "https://newbizbot.ai/api/data/rpc/ppp_loans_by_state_sector?p_state=NY&p_naics_prefixes=%7B72,62%7D&p_limit=50" \
-H "Authorization: Bearer YOUR_API_KEY"Note: The
p_naics_prefixesparameter is an array. URL-encode the curly braces:{72,62}becomes%7B72,62%7D.
ppp_loans_by_city_sector
Returns paginated loan records filtered by state, city, and one or more 2-digit NAICS sectors, sorted by loan amount descending.
Endpoint: GET /data/rpc/ppp_loans_by_city_sector
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
p_state | text | Yes | 2-letter state code |
p_city | text | Yes | City name (case-insensitive) |
p_naics_prefixes | text[] | Yes | Array of 2-digit NAICS sector codes |
p_limit | integer | No | Page size (default 25, max 1,000) |
p_offset | integer | No | Number of rows to skip (default 0) |
Examples
# Healthcare businesses in Miami, FL
curl "https://newbizbot.ai/api/data/rpc/ppp_loans_by_city_sector?p_state=FL&p_city=MIAMI&p_naics_prefixes=%7B62%7D" \
-H "Authorization: Bearer YOUR_API_KEY"
# Restaurants + Retail in Chicago, IL
curl "https://newbizbot.ai/api/data/rpc/ppp_loans_by_city_sector?p_state=IL&p_city=CHICAGO&p_naics_prefixes=%7B72,44%7D&p_limit=50" \
-H "Authorization: Bearer YOUR_API_KEY"Both ppp_loans_by_state_sector and ppp_loans_by_city_sector include a total_count field on every row for easy pagination.
Counting Rows
For simple row counts, you don't need an RPC function. Use the Prefer: count=exact header with limit=0:
curl -I "https://newbizbot.ai/api/data/ppp_loans?borrower_state=eq.CA&limit=0" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Prefer: count=exact"The count is in the Content-Range response header: Content-Range: */48523
Known Data Quality Issues
This is raw SBA data loaded as-is. Known issues include:
| Issue | Scope | Details |
|---|---|---|
| Wrong congressional districts | ~226,000 loans | Loans credited to incorrect CDs |
| Inconsistent city spellings | Widespread | 35+ spellings of "Chicago" alone |
| Invalid state codes | Rare | "FI" instead of "FL", "XX" for unknown |
| Inflated job counts | ~192 loans | Sole proprietorships claiming 500 jobs |
| Business type mismatches | ~23,914 loans | LLC/LLP in name but different classification |
| Missing demographics | Majority | Most race/ethnicity/gender fields are "Unanswered" or NULL |
| Rounded loan amounts | ~94% of loans under $150K | Amounts end with zero despite being payroll-calculated |
| NULL forgiveness data | Varies | Some loans still processing or denied |