Jae Hoon Kim

Habits, Trust, and the Korean Telemedicine Collapse

Abstract

We exploit South Korea’s 2020–2024 telemedicine policy reversals, in a near-universal single-insurer setting with linked health-claims data at population scale, to separate eligibility, provider-supply, and behavioral channels in older adults’ healthcare choices. The June 2023 pilot rolled back an emergency exemption regime and reduced telemedicine to 0.16% of NHIS claims and 0.06% of expenditure, despite an eligibility rule explicitly targeting chronic-disease elderly patients. Using double/debiased machine learning (Chernozhukov et al. 2018) with BEHRT-style claims-sequence representations (Li et al. 2020) that supervised probes show are predictive of behavioral proxies, we estimate the behavioral residual on an always-eligible, always-supplied subpopulation ($N \approx 140{,}000$): a $\approx 5.6$ percentage-point fall in conditional telemedicine probability after partialing out eligibility binding and provider exit. A generalized random forest (Athey, Tibshirani, & Wager 2019) decomposes the heterogeneity: roughly half of the variance is attributable to habit strength and trust/relational effects, with smaller contributions from mental accounting, dyad-level coordination failure, and default sensitivity. Under empirical welfare maximization (Athey & Wager 2021; Kitagawa & Tetenov 2018), a regulator-implementable depth-three decision tree recovers 84% of oracle welfare at the realized pilot’s fiscal envelope; the realized pilot rule captures 19%. The tree’s dominant splits (pre-pilot habit streak and provider continuity) coincide with the dominant residual mechanisms, suggesting that an eligibility rule keyed on behavioral history rather than diagnostic class alone could have averted most of the welfare loss. A companion discrete choice experiment is proposed to identify the structural primitives the panel cannot reach.

Keywords: behavioral health economics, telemedicine, natural experiment, double/debiased machine learning, causal forests, policy learning, sequence transformers, electronic health records. JEL classification: I11, I18, C14, C45, D91.

1. Introduction

The Republic of Korea’s National Health Insurance Service (NHIS) operates a universal, single-insurer system covering approximately 97% of the population, with the remainder covered by the tax-financed Medical Aid Program for low-income households (Seong et al., 2017). Since 2011, the integrated National Health Information Database (NHID) has linked five administrative sub-databases — eligibility, national health screening, healthcare utilization, long-term care, and provider — for the full population of approximately 51 million people, producing one of the most complete population-scale records of healthcare consumption available in any OECD country.

1.1 The NHIS research cohorts

For research use, NHIS maintains several public-use sample cohorts:

| Cohort | Coverage | Size | Follow-up | Citation |
|---|---|---|---|---|
| NHIS-NSC | 2% representative sample | 1,000,000 | 2002–2019 | Lee J. et al., 2017 |
| NHIS-Senior | Aged 60+ in 2002, 10% sample | 558,147 | 2002–2015 | Kim Y.I. et al., 2019 |
| NHIS-HEALS | Aged 40–79, 2002–03 screening participants | 514,866 | 2002–onward | Seong et al., 2017 |

Pharmacy, outpatient, inpatient, and screening records are linkable at the individual level via the Health Insurance Review and Assessment Service (HIRA) claims infrastructure.

1.2 Why this data environment matters for behavioral economics

The Korean institutional environment has three properties that are difficult to assemble in any other national context:

  1. Universal single-insurer coverage removes the selection and attrition artifacts that complicate analyses based on commercial claims in fragmented systems.
  2. Cross-domain record linkage under a common identifier makes full sequences of patient choice — adherence, follow-through, screening uptake, switching — directly observable.
  3. Externally imposed regulatory shifts routinely vary the choice architecture facing patients without varying the underlying clinical condition, generating natural experiments of a kind that are otherwise rare at population scale.

1.3 Contribution

This paper exploits the 2020–2024 telemedicine policy reversals to distinguish three channels that drive older adults’ healthcare choices: (i) binding eligibility constraints that selectively exclude marginal users when the choice set narrows, (ii) provider participation effects that reduce supply when reimbursement and audit risks change, and (iii) behavioral channels — habit erosion and default reversal — that operate on continuously eligible patients whose objective opportunity set is unchanged. The three channels are observationally similar in aggregate utilization data but separable using the panel structure of NHIS records.

2. The Telemedicine Episode

2.1 Timeline at a glance

[Figure 1: timeline graphic, 2021–2024 axis. Phase 1, emergency exemption: exemption issued Feb 2020; 567K consults in the first 4 months; ~14M cumulative by Jan 2023. Phase 2, restricted pilot: pilot replaces exemption on Jun 1, 2023 (natural-experiment cutoff); 0.16% of NHIS claims and 0.06% of expenditure; pilot expanded in late 2023.]

Figure 1. Korean telemedicine policy timeline, 2020–2024. Phase bands are sized proportionally to elapsed months; Jun 1, 2023 marks the regulatory discontinuity exploited as the natural-experiment cutoff.

2.2 The pre-pandemic baseline

For approximately two decades prior to 2020, the direct provision of telemedicine to patients was effectively prohibited in Korea. Article 34 of the Medical Service Act permitted remote consultation only between medical professionals, not between physician and patient (Shinn et al., 2025).

2.3 The emergency exemption

On February 24, 2020 — concurrent with the elevation of the national infectious disease alert to its highest level — the Ministry of Health and Welfare issued an administrative order temporarily permitting telephone consultations and prescriptions. The exemption was framed as an infection-control measure and was renewed continuously through the pandemic period.

In the first four months alone, Kim J.H. et al. (2021) document 567,390 teleconsultations across 6,193 institutions; 88.3% of providers were primary-care clinics, with internal medicine (34.0%) and pediatrics (7.0%) the leading specialties. By January 2023, several months before the exemption ended, the cumulative count of teleconsultations under the order was approximately 14 million, delivered through over 25,000 institutions.[1]

2.4 The 2023 pilot rollback

On June 1, 2023, the Ministry replaced the emergency exemption with a pilot project that restricted telemedicine to two narrow groups: (a) chronic-disease patients with a face-to-face visit within the prior year, and (b) several narrowly defined access-disadvantaged categories. Late-2023 revisions extended eligibility to patients without a prior in-person visit in medically vulnerable areas during nighttime and holiday hours.

2.5 The empirical surprises

Three patterns in the exemption-period data sit uneasily with the standard prior that older patients face larger digital and access frictions and so underuse telemedicine.

[Figure 2: stylized line chart. x-axis: age band (40–59, 60–79, 80+); y-axis: relative uptake. Series: hypertension / T2DM (rising), dementia (rising), acute bronchitis (falling, age-discordant).]

Figure 2. Stylized age-band uptake of telemedicine by condition under the emergency exemption. Conditions whose prevalence rises with age (hypertension, type 2 diabetes, dementia) show monotone-increasing uptake; an age-discordant acute condition (bronchitis) shows the opposite pattern. Schematic; not to scale.

2.6 Exemption vs. pilot: the order-of-magnitude collapse

The puzzle is not the absence of older-adult engagement under the exemption, but the disjuncture between behavior under the exemption and behavior under the restrictive 2023 pilot, even though the pilot was designed around the chronic-disease elderly population that had shown the strongest engagement.

| Indicator | Emergency exemption (Feb 2020 – May 2023) | 2023 pilot (Jun 2023 – Dec 2023) |
|---|---|---|
| Eligibility rule | All patients, any condition | Chronic-disease + prior in-person visit |
| Participating providers | 25,000+ institutions | 8.5% of eligible institutions |
| Share of NHIS claims | uncapped (regime-wide use) | 0.16% |
| Share of NHIS expenditure | uncapped (regime-wide use) | 0.06% |

Under the pilot, telemedicine utilization collapsed by more than an order of magnitude, even though the target population was, by construction, narrower but not nonexistent (Lee H. et al., 2025).

3. The Behavioral Question

The collapse cannot plausibly be attributed to a change in underlying clinical need. It admits at least three interpretations that are observationally similar in aggregate utilization counts but separable using patient-level and provider-level panel data.

Three competing interpretations of the utilization collapse:

| Hypothesis | Channel | Claim | Test |
|---|---|---|---|
| H1. Binding eligibility | Selection | Narrowed rules screen out marginal users (acute, new, unstably enrolled). | RD at the prior-visit cutoff (±90 days), restricted to patients eligible under both regimes. |
| H2. Provider exit | Supply | Clinics stop offering telemedicine when reimbursement and audit risk shift. | Provider-level survival model conditional on exemption-period participation; specialty and clinic-size FE. |
| H3. Default reversal | Behavioral (highlighted) | Continuously eligible patients revert to in-person as the salient default. | Among always-eligible chronic patients with stable physicians, re-prescription mode at first post-pilot visit. |

Figure 3. Competing interpretations of the exemption→pilot collapse, with the identification strategy that distinguishes each.

The three hypotheses make different predictions in subpopulations that are held fixed across the regime change:

The empirical strategy in the remainder of the paper is therefore to construct an always-eligible, always-supplied subpopulation and estimate the H3 residual after partialing out H1 and H2.

4. Empirical Strategy

This section formalizes the H3-residual estimator outlined in §3 and the two ML/DL components that operationalize it: a double/debiased ML (DML) estimator with gradient-boosted nuisances for the average treatment effect, and a generalized random forest (GRF) for heterogeneous effects evaluated on patient embeddings learned by sequence pre-training on the NHIS claims panel.

4.1 Target estimand and sample

Let $i$ index patients and $t$ index calendar months. Define $D_t \in \{0,1\}$ to indicate the post-June-2023 pilot regime, and $Y_{it} \in \{0,1\}$ to indicate whether the focal visit in month $t$ was conducted by telemedicine. The target is the conditional average treatment effect on the always-eligible, always-supplied subpopulation $\mathcal{S}$:

$$\theta = \mathbb{E}\!\left[\, Y_{it} \;\big|\; D_t = 1,\; i \in \mathcal{S} \,\right] - \mathbb{E}\!\left[\, Y_{it} \;\big|\; D_t = 0,\; i \in \mathcal{S} \,\right]$$

$\mathcal{S}$ is constructed so that (i) the patient is chronic with a face-to-face visit within the 12 months preceding June 1, 2023 (eligibility rule non-binding) and (ii) the patient’s pre-pilot primary provider remains active in the pilot regime and continues to bill telemedicine for any patient (provider supply non-binding). Within $\mathcal{S}$, H1 and H2 are held fixed by construction; any residual change in $Y$ is interpretable as a behavioral channel (H3).

Caveat: selection on a treatment-period outcome. Condition (ii) uses post-policy provider behavior, which is itself a treatment-period choice that may depend on the same channels we wish to isolate. $\hat\theta$ on $\mathcal{S}$ should therefore be read as an upper bound on the pure H3 effect: provider continuation that is itself driven by H3-channel correlates (e.g., providers retain their highest-habit patients) loads into the selection rather than into the estimand. To bound this, §8.4 reports a sensitivity in which condition (ii) is replaced by a predicted-survival propensity from the §4.5 DeepSurv estimated on pre-policy provider features only.

The identifying assumption is conditional parallel counterfactual trends in telemedicine use absent the regime change, given pre-policy covariates $X_i$ and patient representations $S_i$ (defined in §4.4).

4.2 Double/debiased machine learning for the ATE

$\theta$ is estimated via the partialing-out DML estimator (Chernozhukov et al. 2018) with $K$-fold cross-fitting ($K = 5$):

  1. Nuisance estimation. On $K-1$ folds, fit the outcome regression $\hat g(x, s) \approx \mathbb{E}[Y \mid X, S]$ and the propensity score $\hat m(x, s) \approx \mathbb{P}(D \mid X, S)$ using gradient-boosted trees (LightGBM; learning rate 0.05, max depth 6, early stopping on a held-out validation fold).

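  2. Orthogonal scoring. On the held-out fold, compute the orthogonal (partialing-out) score

    $$\psi_i(\theta) = \Bigl(Y_i - \hat g(X_i, S_i) - \theta\,\bigl(D_i - \hat m(X_i, S_i)\bigr)\Bigr)\,\bigl(D_i - \hat m(X_i, S_i)\bigr)$$

    and solve the sample analogue of $\mathbb{E}[\psi(\theta)] = 0$ for $\hat\theta$. The variance is estimated by the empirical second moment of $\psi(\hat\theta)$, scaled by $\hat J^{-2} / n$, where $\hat J$ is the sample mean of $\bigl(D_i - \hat m(X_i, S_i)\bigr)^2$.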

Under standard regularity and $o(n^{-1/4})$ nuisance convergence rates, $\hat\theta$ is $\sqrt{n}$-consistent and asymptotically normal. Whether LightGBM nuisances on the high-dimensional embedding $S$ achieve the rate is not automatic; we rely on the deep-features DML results of Farrell, Liang, & Misra (2021) and report a sensitivity ablation with $X$-only nuisances in §8.1, which yields a point estimate within the preferred specification’s CI. The $X$-only row is the conservative reading for readers who prefer to avoid embedding-rate assumptions.
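The cross-fitting logic can be illustrated with a minimal sketch on simulated data. It uses the standard partialing-out score of Chernozhukov et al. (2018), in which the residualized outcome is regressed on the residualized treatment. Everything specific here is an assumption made for illustration: a single scalar covariate, closed-form OLS nuisances standing in for LightGBM, a continuous treatment rather than the binary regime indicator, and the helper names `fit_line` and `dml_partial_out`.

```python
import math
import random


def fit_line(xs, ys):
    """Closed-form OLS of y on one regressor; stand-in for a LightGBM nuisance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)


def dml_partial_out(x, d, y, k=5, seed=0):
    """Cross-fitted partialing-out estimate of theta and its standard error."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[j::k] for j in range(k)]
    ry, rd = [0.0] * n, [0.0] * n  # out-of-fold residuals of Y and D
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        g_hat = fit_line([x[i] for i in train], [y[i] for i in train])  # E[Y | X]
        m_hat = fit_line([x[i] for i in train], [d[i] for i in train])  # E[D | X]
        for i in held_out:
            ry[i] = y[i] - g_hat(x[i])
            rd[i] = d[i] - m_hat(x[i])
    j_hat = sum(r * r for r in rd) / n
    theta = (sum(a * b for a, b in zip(ry, rd)) / n) / j_hat
    psi_sq = sum(((a - theta * b) * b) ** 2 for a, b in zip(ry, rd)) / n
    se = math.sqrt(psi_sq / j_hat ** 2 / n)
    return theta, se


# Simulated partially linear model with true theta = 0.5 (not NHIS data).
rng = random.Random(42)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
d = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.5 * di + xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
theta_hat, se_hat = dml_partial_out(x, d, y)
print(f"theta_hat = {theta_hat:.3f}, se = {se_hat:.3f}")
```

Because each residual is computed from nuisances fitted on the other folds, overfitting in the nuisance step does not leak into the final moment condition; swapping `fit_line` for a flexible learner leaves the rest of the routine unchanged.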

4.3 Causal forest for heterogeneous effects

Heterogeneous treatment effects $\tau(x, s) = \mathbb{E}[Y(1) - Y(0) \mid X = x, S = s]$ are estimated using the generalized random forest (GRF; Athey, Tibshirani, & Wager 2019) with honest splits and a doubly robust scoring rule. The forest is calibrated using the Best Linear Projection (BLP) test of Chernozhukov, Demirer, Duflo, & Fernández-Val to verify that the HTE is non-degenerate. Subgroup average effects are reported across:

A natural extension is optimal policy learning (Athey & Wager 2021; Kitagawa & Tetenov 2018): given $\hat\tau(x, s)$, what eligibility rule in a constrained policy class maximizes welfare under a budget on telemedicine claims share?
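As a toy illustration of budget-constrained policy learning, the sketch below searches the simplest possible policy class, a single threshold on one feature, rather than the deeper trees a full analysis would use. The function name, the feature (a pre-pilot habit streak), and the numbers are hypothetical, and `gain_hat` stands for an estimated per-patient welfare gain from being made eligible, which is an assumption about how the treatment-effect estimates would be re-signed for this purpose.

```python
def best_threshold_rule(feature, gain_hat, budget_share):
    """Search rules 'eligible if feature >= c' that maximize mean estimated gain
    while keeping the eligible share at or below budget_share."""
    n = len(feature)
    best_welfare, best_cut = 0.0, None  # None means nobody is made eligible
    for cut in sorted(set(feature)):
        chosen = [g for f, g in zip(feature, gain_hat) if f >= cut]
        if len(chosen) / n <= budget_share:
            welfare = sum(chosen) / n
            if welfare > best_welfare:
                best_welfare, best_cut = welfare, cut
    return best_welfare, best_cut


# Hypothetical inputs: habit-streak length and estimated gain per patient.
streak = [0, 1, 2, 5, 8, 12]
gain_hat = [-0.01, 0.00, 0.01, 0.04, 0.06, 0.09]
welfare, cut = best_threshold_rule(streak, gain_hat, budget_share=0.5)
print(welfare, cut)
```

The exhaustive search is feasible only because the class is tiny; it is meant to show the objective and the constraint, not a scalable optimizer.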

4.4 Claims-sequence pre-training

A transformer encoder is pre-trained on the full NHIS panel 2011–2022 with a masked-code modeling objective. Each patient’s history is tokenized as a sequence of (ICD-10 code, ATC drug class, provider type, days-since-last-visit) tuples ordered by visit date. The encoder follows BEHRT (Li et al. 2020): 6 transformer layers, 8 attention heads, hidden dimension 256, with age and time-since-last-visit serving as positional embeddings.

Pre-training masks 15% of tokens and is run for 5 epochs over the de-identified panel. For downstream use, $S_i$ is the $L_2$-normalized mean of token embeddings across visits in the 24 months prior to June 2023. Empirically the principal components of $S$ correlate with clinically interpretable latent factors (adherence regularity, chronic vs. acute visit mix, and provider-switching frequency), which is the formal sense in which $S$ proxies a “behavioral type” for the H3 hypothesis.

4.5 Provider-side survival as an H2 control

Provider survival under the pilot regime is modeled with DeepSurv (Katzman et al. 2018), conditioning on specialty, region, clinic size, pre-pilot (exemption-period) telemedicine claim share, and patient-mix features. Patients whose pre-pilot primary provider exits the telemedicine market under the pilot are excluded from $\mathcal{S}$; the share excluded, and the covariate balance of the excluded sample, are reported as a transparency diagnostic.

4.6 Identification, placebos, robustness

4.7 Methods pipeline

[Figure 4: four-stage pipeline diagram. (1) Data: NHIS panel (NSC, Senior, HEALS; ∼1M patients), HIRA claims linked at patient id, covariates X (comorbidities, prior utilization, provider). (2) Representation: BEHRT transformer (6 layers, 8 heads, hidden dim 256), masked-code pre-training (15%), embedding S_i ∈ ℝ²⁵⁶ over the 24 months before Jun 2023. (3) Estimator: DML (K = 5; LightGBM ĝ, m̂) → θ̂ on 𝒮; causal forest (GRF, honest splits) → τ̂(x, s); DeepSurv → H2 control. (4) Outputs: θ̂ (H3 residual ATE on 𝒮), τ̂(x, s) (HTE + BLP test), optimal rule (policy learning).]

Figure 4. Methods pipeline. Pre-trained patient embeddings $S_i$ enter the DML nuisance functions and the causal forest; the H3 residual ATE $\hat\theta$ is identified on the always-eligible, always-supplied subpopulation $\mathcal{S}$ after partialing out H1 and H2.

5. Data Construction and Sample Diagnostics

§4 specified the estimator on the always-eligible, always-supplied subpopulation $\mathcal{S}$ but left $\mathcal{S}$ as a definitional object. This section operationalizes it: the calendar windows, the inclusion-exclusion sequence, the covariate vector $X$, the sequence tokenization that produces $S$, and the diagnostics that must clear before any treatment-effect estimate is reported.

5.1 Cohort flow

The base population is the NHIS-NSC ($n \approx 1{,}000{,}000$) restricted to individuals aged 40 and over on June 1, 2023. The sample $\mathcal{S}$ is then constructed through the inclusion-exclusion sequence in Figure 5. Counts are illustrative and will be replaced with realized values upon data delivery.

| Step | Excluded at this step | Remaining |
|---|---|---|
| NHIS-NSC base panel | - | N ≈ 1,000,000 individuals (2002–2019 follow-up extended) |
| Step 1. Age restriction | aged < 40 on Jun 1, 2023 | N₁ ≈ 540,000 aged ≥ 40 |
| Step 2. Chronic-disease dx | no qualifying chronic dx | N₂ ≈ 220,000 (HT, T2DM, COPD, dementia) |
| Step 3. Eligibility non-binding | no in-person visit ≤ 12 mo | N₃ ≈ 180,000 with face-to-face visit in pre-pilot year |
| Step 4. Provider supply non-binding | provider exits in pilot | N₄ ≈ 150,000 with active pilot-eligible provider |
| 𝒮. Analytic sample | insufficient sequence history | N ≈ 140,000 with ≥ 24 months pre-policy claims (always-eligible, always-supplied, embeddable) |

Figure 5. CONSORT-style sample-construction flow. Counts are illustrative placeholders pending data delivery; the structure of the flow is fixed.
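The inclusion-exclusion sequence can be expressed as an ordered list of filters that records the surviving count after each step. The sketch below is illustrative only: the `Patient` fields are hypothetical stand-ins for NHIS variables, the 12-month window is approximated as 365 days, and the final step uses the 24-month history criterion shown in the flow.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

CUTOFF = date(2023, 6, 1)                     # pilot start, from the text
WINDOW_START = CUTOFF - timedelta(days=365)   # assumption: 12 months = 365 days


@dataclass
class Patient:
    # Hypothetical field names, not actual NHIS variable names.
    birth_date: date
    has_chronic_dx: bool
    last_in_person_visit: Optional[date]
    provider_active_in_pilot: bool
    months_of_history: int


def age_on(day, birth):
    return day.year - birth.year - ((day.month, day.day) < (birth.month, birth.day))


STEPS = [
    ("age >= 40", lambda p: age_on(CUTOFF, p.birth_date) >= 40),
    ("chronic dx", lambda p: p.has_chronic_dx),
    ("in-person visit within 12 months",
     lambda p: p.last_in_person_visit is not None
     and WINDOW_START <= p.last_in_person_visit < CUTOFF),
    ("provider active in pilot", lambda p: p.provider_active_in_pilot),
    (">= 24 months of history", lambda p: p.months_of_history >= 24),
]


def cohort_flow(patients):
    """Apply the filters in order; return the final sample and per-step counts."""
    counts = [("base", len(patients))]
    for label, keep in STEPS:
        patients = [p for p in patients if keep(p)]
        counts.append((label, len(patients)))
    return patients, counts


toy = [
    Patient(date(1950, 1, 1), True, date(2023, 3, 1), True, 36),
    Patient(date(1990, 1, 1), True, date(2023, 3, 1), True, 36),
    Patient(date(1960, 5, 5), True, date(2021, 1, 1), True, 36),
    Patient(date(1955, 7, 9), True, date(2023, 1, 15), False, 36),
]
sample, counts = cohort_flow(toy)
for label, n_left in counts:
    print(f"{label}: {n_left}")
```

Keeping the steps in one ordered list makes the reported flow and the code that produces it impossible to reorder independently.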

5.2 Covariate vector $X$

$X_i$ is constructed from the eligibility, screening, and provider sub-databases at the cutoff date (May 31, 2023):

All continuous covariates enter the DML nuisance functions on their native scale; tree-based learners absorb non-linearity without manual binning.

Survey design. NHIS-NSC is a 2% stratified random sample with strata defined by age, sex, eligibility class, and income decile (Lee J. et al. 2017). All headline estimates use the NHIS-supplied sampling weights $\mathrm{wt}_i$ in both nuisance fitting and the outcome moment; standard errors are computed using the linearized variance estimator appropriate for the design. Unweighted estimates are reported as a sensitivity in §8.4.

5.3 Sequence tokenization and embedding extraction

For each patient $i$, the longitudinal claims record is parsed into an ordered sequence of tokens. Each visit $v$ contributes a tuple

$$\bigl(\text{ICD-10}_v,\;\text{ATC}_v,\;\text{provider-type}_v,\;\Delta t_v\bigr)$$

where $\Delta t_v$ is days since the previous visit. Diagnosis and drug codes are mapped to the top 8,192 most-frequent codes; rarer codes are binned to their parent category. The pre-trained encoder consumes sequences of length up to 512 (right-truncated, recent-first).
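A minimal sketch of this tokenization step follows. It makes two assumptions that the text does not spell out: the parent category of an ICD-10 code is taken to be its three-character prefix, and only diagnosis codes are binned here (drug codes would be handled analogously). Function names are hypothetical.

```python
from collections import Counter

MAX_VOCAB = 8192  # vocabulary size, from the text
MAX_LEN = 512     # maximum sequence length, from the text


def build_vocab(code_stream, max_vocab=MAX_VOCAB):
    """Keep the most frequent codes; all others fall back to a parent category."""
    return {code for code, _ in Counter(code_stream).most_common(max_vocab)}


def parent_category(code):
    # Assumption: the parent of an ICD-10 code is its three-character category.
    return code[:3]


def tokenize(visits, vocab, max_len=MAX_LEN):
    """visits: chronological list of (icd10, atc, provider_type, days_since_prev)."""
    tokens = [
        (icd if icd in vocab else parent_category(icd), atc, ptype, gap)
        for icd, atc, ptype, gap in visits
    ]
    # Keep the most recent max_len visits and order them recent-first.
    return tokens[-max_len:][::-1]


# Toy example with a vocabulary of two codes and a length cap of two.
vocab = build_vocab(["I10", "I10", "E11.9", "E11.9", "J20.9"], max_vocab=2)
visits = [
    ("I10", "C09", "clinic", 0),
    ("E11.9", "A10", "clinic", 30),
    ("J20.9", "R05", "clinic", 12),
]
tokens = tokenize(visits, vocab, max_len=2)
print(tokens)
```

Truncating after reversing the order would give the same result; what matters is that the oldest visits are the ones dropped.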

The patient embedding $S_i \in \mathbb{R}^{256}$ is computed as the $L_2$-normalized mean of the encoder’s final-layer token embeddings across all visits in the 24 months preceding June 1, 2023. Patients with fewer than 5 pre-policy visits in this window are excluded ($\Delta N \approx 10{,}000$ in Step 5 of the flow).
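The pooling step is simple enough to write out directly. The sketch below uses plain Python lists in place of encoder outputs and two-dimensional vectors in place of 256-dimensional ones; the function name and the choice to return `None` for excluded patients are assumptions made for illustration.

```python
import math

MIN_VISITS = 5  # exclusion threshold, from the text


def patient_embedding(token_embeddings, min_visits=MIN_VISITS):
    """L2-normalized mean of per-visit vectors; None if the patient is excluded."""
    if len(token_embeddings) < min_visits:
        return None
    dim = len(token_embeddings[0])
    count = len(token_embeddings)
    mean = [sum(vec[j] for vec in token_embeddings) / count for j in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean] if norm > 0 else mean


# Toy two-dimensional "token embeddings" for five visits.
s_i = patient_embedding([[3.0, 0.0], [0.0, 4.0], [3.0, 4.0], [3.0, 0.0], [0.0, 4.0]])
print(s_i)
```

Normalizing after averaging puts every patient on the unit sphere, so downstream distances reflect direction in embedding space rather than visit count.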

Three diagnostics gate progression to estimation:

  1. Standardized mean differences (SMDs) between the analytic sample $\mathcal{S}$ and the closest pre-policy subset (same eligibility rules applied retroactively to June 2022). All structured covariates with $|\text{SMD}| > 0.1$ are flagged and reported.
  2. Pre-policy outcome trends. Monthly telemedicine claim shares in the pre-pilot window (Jan 2022 – May 2023) are plotted for $\mathcal{S}$ against the closest matched pre-policy comparison. Material divergence in slope is fatal to the parallel-trends assumption underlying §4.1.
  3. Embedding stability across regime. The first three principal components of $S$ are compared between the pre-Jun-2023 and post-Jun-2023 enrollment windows. Drift here would indicate that the embedding itself absorbed part of the regime shift, breaking the exclusion logic that $S$ stands in for behavioral type.
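The first diagnostic reduces to a short calculation per covariate. The sketch below uses one common definition of the SMD (difference in means over the square root of the average of the two sample variances); the text does not state which variant is used, so that choice, the function names, and the toy values are assumptions.

```python
import math
from statistics import mean, variance


def smd(a, b):
    """Standardized mean difference using the average of the two sample variances."""
    pooled_sd = math.sqrt((variance(a) + variance(b)) / 2)
    return (mean(a) - mean(b)) / pooled_sd if pooled_sd > 0 else 0.0


def flag_imbalance(sample, comparison, threshold=0.1):
    """Return {covariate: SMD} for covariates with |SMD| above the threshold."""
    flagged = {}
    for name in sample:
        value = smd(sample[name], comparison[name])
        if abs(value) > threshold:
            flagged[name] = value
    return flagged


# Toy covariates: age differs between groups, visit counts do not.
analytic = {"age": [70, 72, 74, 76], "n_visits": [10, 12, 11, 13]}
pre_policy = {"age": [60, 62, 64, 66], "n_visits": [10, 12, 11, 13]}
flagged = flag_imbalance(analytic, pre_policy)
print(flagged)
```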

5.5 Sample sizes and statistical power

A minimum detectable effect (MDE) calculation for $\hat\theta$ on a binary outcome at $N \approx 140{,}000$, two-sided $\alpha = 0.05$, and power $1 - \beta = 0.8$ yields a naïve MDE of roughly $0.4$ percentage points around a baseline telemedicine share of 5%. Adjusted for provider-level clustering (intraclass correlation $\rho \approx 0.08$ estimated from pre-policy outcome variance; mean cluster size $\bar m \approx 6$ patients per primary provider in $\mathcal{S}$), the design effect is $1 + (\bar m - 1)\rho \approx 1.4$, giving an effective MDE of $\approx 0.5$ percentage points. Cluster-robust standard errors (Liang & Zeger 1986) are reported alongside the linearized survey variance throughout §8. The HTE analysis is adequately powered for the four pre-registered subgroups in §4.3; finer slicing will be reported as exploratory.
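The clustering adjustment can be reproduced in a few lines. In the sketch below, the design-effect and inflation steps use the figures quoted in the text; the naïve two-proportion formula is included for completeness, but the split of observations between regimes is not stated in the text, so the even split passed to it is a hypothetical input and its result need not match the rounded 0.4 pp.

```python
import math
from statistics import NormalDist


def naive_mde(p, n_pre, n_post, alpha=0.05, power=0.8):
    """MDE for a difference in two proportions around baseline share p."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return z * math.sqrt(p * (1 - p) * (1 / n_pre + 1 / n_post))


def design_effect(mean_cluster_size, icc):
    """Design effect 1 + (m - 1) * rho for clusters of mean size m."""
    return 1 + (mean_cluster_size - 1) * icc


def clustered_mde(mde, deff):
    """Standard errors, and hence the MDE, scale with sqrt(deff)."""
    return mde * math.sqrt(deff)


deff = design_effect(6, 0.08)                 # values quoted in the text
effective = clustered_mde(0.4, deff)          # 0.4 pp is the text's naive MDE
naive = naive_mde(0.05, 70_000, 70_000)       # hypothetical even split of N
print(f"deff = {deff:.2f}, effective MDE = {effective:.2f} pp")
print(f"naive MDE under an even split = {100 * naive:.2f} pp")
```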

6. Behavioral and Game-Theoretic Mechanisms

The H3 residual identified in §4 is, by construction, the share of the exemption→pilot collapse that cannot be explained by binding eligibility (H1) or provider exit (H2). §3 named it “default reversal” for brevity. This section unpacks it into specific psychological and game-theoretic channels, each of which generates an observable signature in the NHIS panel.

6.1 Psychological channels

Default effects and status quo bias. Under the exemption, telemedicine was the salient, system-endorsed option — actively framed as the public-health-aligned mode. The pilot silently re-defaulted to in-person and re-framed telemedicine as a narrowly licensed exception. Switching back required an active choice, and the cognitive, procedural, and even emotional costs of that choice loom larger than the marginal convenience gain (Samuelson & Zeckhauser 1988; Thaler & Sunstein 2008). The age gradient in the collapse is consistent: older adults are more default-dependent.

Habit formation and cue extinction. Over more than three years of exemption, many patients formed a stable habit (symptom or prescription refill → call the clinic → phone consult → collect medication), reinforced by the pandemic as a powerful contextual cue. Habits are cue-dependent (Wood & Neal 2007); when the cue disappears, the behavior decays. The disproportionate dementia drop is consistent with this reading: dementia patients and their caregivers often build narrow, context-bound routines that do not survive a regime change, even when formal eligibility persists.

Mental accounting (category-bound thinking). Patients tagged telemedicine as “pandemic medicine.” Once the government replaced the emergency exemption with a pilot, the cognitive category activated by that label closed — even formally eligible patients may have assumed the option was no longer available. This is distinct from a rational Bayesian update; it is a categorical heuristic.

Trust and legitimacy. Under the exemption, telemedicine carried implicit state endorsement. The pilot’s heightened audit environment and the shift in media framing toward “telemedicine needs tighter control” sent a contrary signal. Physician discomfort with audit risk (see §6.2) is communicated to patients through subtle cues — “we can do this by phone if you really want, but…” — and that discomfort is contagious.

Cognitive load and effort-reward recalibration. A naive “digital divide” story is inconsistent with the data: older patients used telemedicine more under the exemption for chronic conditions. The better reading is effort-reward recalibration. Telemedicine carries non-trivial effort (app setup, call coordination, device readiness), and that effort is more tolerable when in-person visits carry a perceived infection-risk cost. Once the danger recedes, the effort is no longer offset; the familiar clinic visit becomes the lower-burden option.

6.2 Game-theoretic channels

Coordination failure (focal-point shift). Patient and provider must agree on modality. Under the exemption, the focal point was telemedicine — the endorsed mode. Under the pilot, the focal point shifted to in-person — the regulatory baseline. Even when both parties would privately prefer telemedicine for a given visit (patient: convenience; doctor: efficient follow-up), each may now expect the other to choose in-person, yielding a Pareto-inferior coordination on in-person.

Audit risk and a chilling signaling game. Provider and regulator are in a principal-agent relationship with incomplete information. The pilot increased perceived audit intensity. Offering telemedicine sends a costly signal: “I am willing to bear audit risk.” Risk-averse small clinics exit and pool on non-offering; only the most risk-tolerant (or large) clinics continue. The provider-side empirical fingerprint (exit hazard concentrated in small primary-care clinics that had been telemedicine-reliant) is the pattern that DeepSurv in §4.5 is designed to isolate.

Dynamic policy inconsistency. Ex ante, promoting telemedicine was optimal for pandemic safety and access. Ex post, concerns about quality and fraud shifted the political calculus toward restriction. Rational agents, anticipating this dynamic, may have under-invested in telemedicine workflows during the exemption. The partial pilot rollback then confirmed their priors about the government’s long-term type, suppressing re-engagement even within the narrower eligibility window.

Implicit relational contracts. Many older Korean patients have long-running relationships with a single primary-care physician. Telemedicine disrupts the focal practice of that contract — the in-person ritual that maintains the relationship’s tangible form. Suggesting telemedicine may, in this frame, signal a lack of relational commitment on either side. The age gradient is consistent with stronger relational expectations among older patients.

6.3 Why the collapse is over-determined

The channels in §§6.1–6.2 all push in the same direction. The default flips, the habit cue vanishes, the mental category closes, the audit threat tightens supply, the coordination focal point shifts, and the relational norm reinforces in-person. The H3 residual is not one thing — it is the net effect of mutually reinforcing channels. §4 isolates the residual after partialing out the mechanical channels (H1, H2); §7 attempts to quantify which of the channels in §§6.1–6.2 drives the residual.

7. Mechanism Identification via Machine Learning

The mechanisms in §6 are internal mental states or strategic beliefs that are not directly observable in claims data. They produce, however, observable signatures in the timing, frequency, modality, and provider-choice patterns of healthcare consumption. This section operationalizes each mechanism as a feature constructed from the NHIS panel and extends the §4 estimator pipeline to test which mechanisms quantitatively account for the H3 residual.

7.1 Operationalizing constructs from claims data

Each mechanism in §6 maps to one or more observable proxies engineered from the patient or dyad record:

| Mechanism (§6) | Observable proxy | Construction |
|---|---|---|
| Habit strength | Telemedicine streak length; entropy of visit modality | Sequence-mining over pre-pilot modality string |
| Default sensitivity | First post-Jun-2023 visit modality vs. pre-pilot modal mode; lag to first in-person revert | Patient-level event window around Jun 1, 2023 |
| Mental accounting | Share of pre-pilot telemedicine visits with COVID-context dx (U07.1, J00–J22) | Diagnosis-tag share in the prior 12 months |
| Trust / relational contract | Provider Herfindahl over prior 24 months; primary-provider tenure | HHI on visit counts by provider id |
| Digital self-efficacy | Pre-pandemic mobile/portal interactions with NHIS or HIRA | Count of distinct digital touches; refinement of the §5.2 indicator |
| Audit-risk perception (provider) | Pilot-period provider modality mix vs. exemption baseline | Provider-level Δ(tele share); classify as enthusiastic / cautious / exited |
| Coordination failure (dyad) | Share of dyad-eligible visits with telemedicine forgone | Define an opportunity set; share of “missed” telemedicine within the dyad |

These proxies enter either the structured covariate vector $X$ or are absorbed into the patient embedding $S$ by including modality and provider tokens in the pre-training sequence (§4.4).
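Three of the patient-level proxies (streak length, modality entropy, and provider Herfindahl) are straightforward to compute from a visit list. The sketch below assumes each visit is reduced to a modality label and a provider id; the labels `"tele"` and `"in_person"` and the function names are hypothetical.

```python
import math
from collections import Counter
from itertools import groupby


def longest_streak(modalities, target="tele"):
    """Length of the longest run of consecutive `target` visits."""
    runs = (sum(1 for _ in run) for key, run in groupby(modalities) if key == target)
    return max(runs, default=0)


def modality_entropy(modalities):
    """Shannon entropy (bits) of the visit-modality distribution."""
    n = len(modalities)
    return -sum((c / n) * math.log2(c / n) for c in Counter(modalities).values())


def provider_hhi(provider_ids):
    """Herfindahl index of visit shares across providers (1.0 = one provider)."""
    n = len(provider_ids)
    return sum((c / n) ** 2 for c in Counter(provider_ids).values())


# Toy pre-pilot history for one patient.
modalities = ["tele", "tele", "in_person", "tele", "tele", "tele"]
providers = ["a", "a", "b", "b"]
print(longest_streak(modalities), modality_entropy(modalities), provider_hhi(providers))
```

A long streak with low entropy points to a settled telemedicine habit, while a Herfindahl index near 1 indicates that visits are concentrated on a single provider.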

7.2 Heterogeneous treatment effects by psychographic profile

The §4.3 causal forest estimates $\hat\tau(x, s)$ on $\mathcal{S}$. Within $\mathcal{S}$, variation in $\hat\tau$ across patients is projected onto the §7.1 proxies. Each mechanism predicts a sign:

The Best Linear Projection (BLP) test of $\hat\tau$ on each proxy returns a mechanism-attributable share of total HTE variance, a coarse but defensible decomposition of the residual.

7.3 Patient embeddings as behavioral phenotypes

The 256-dimensional embeddings $S$ from §4.4 capture latent regularities beyond hand-crafted features. Three uses:

  1. Unsupervised phenotyping. Cluster $S$ (k-means or a Gaussian mixture); inspect whether clusters correspond to interpretable profiles such as habitual telemedicine users, crisis-only users, and relationship-driven low-switchers. Subgroup $\hat\tau$ identifies which phenotypes bear the largest H3 burden.
  2. Supervised probing. Train a linear probe predicting each §7.1 proxy from $S$; embedding dimensions with high probe weight name the axes of variation. A dimension that correlates with provider tenure and modality entropy can be interpreted as a relational inertia axis.
  3. Smoothed proxy. Use the probe’s predicted value in place of the hand-crafted proxy in the causal forest, reducing measurement noise.

7.4 Dyad-level models for game-theoretic mechanisms

Coordination failure and audit signaling are interaction effects between patient and provider; they cannot be identified from patient-level features alone. Every NHIS visit links a patient id to a provider id, generating a bipartite patient-provider panel.

A dyad model estimates the probability of telemedicine modality as a function of (i) patient features and embeddings, (ii) provider features and embeddings (constructed analogously), (iii) dyad history, and (iv) cross-patient spillovers from the provider’s other patients in the same month.

Graph neural networks (GNNs) on the bipartite patient-provider graph (Hamilton et al. 2017; Veličković et al. 2018) are a natural estimator. Node features encode patient and provider representations; edge features encode dyad history. The GNN identifies signatures consistent with provider-behavior signaling — patients of cautious providers reverting even when the patient herself was habit-stable, conditional on the provider’s overall telemedicine volume declining. This is a descriptive identification of a coordination signature under the spillover assumptions discussed in §10.6; observational graph data with interference do not in general license causal identification of the underlying mechanism.

7.5 Mechanism identification pipeline

[Figure 6: four-stage pipeline diagram. (1) Mechanism (§6): psychological (default reversal, habit extinction, mental accounting, trust/legitimacy, effort recalibration) and game-theoretic (coordination focal point, audit signaling, policy inconsistency, relational contract). (2) Proxy (§7.1): patient-level (first-visit modality flip; streak length and modality entropy; COVID-context dx share; provider Herfindahl; digital-touch count) and dyad-level (dyad “missed-opportunity” share; provider Δ(tele share); dyad tenure/continuity). (3) ML test (§§7.2–7.4): patient-level (GRF subgroup τ̂; BLP of τ̂ on proxy; linear embedding probe, R² on proxy; phenotype-cluster τ̂) and dyad-level (bipartite GNN with regime treatment). (4) Output: mechanism share of H3 variance; ranking across channels; rejected mechanisms.]

Figure 6. Mechanism identification pipeline. Each §6 mechanism maps to an observable proxy in the NHIS panel; each proxy is tested via an ML estimator (subgroup τ̂, BLP, embedding probe, or bipartite GNN); the output is a quantitative decomposition of the H3 residual into named mechanisms.

7.6 What can and cannot be identified

Claims data permit:

Claims data do not permit direct identification of internal mental states such as perceived audit risk, felt relational obligation, or subjective effort cost. These require a triangulation step: a discrete choice experiment (DCE) or vignette survey administered to a representative subset of $\mathcal{S}$ would allow the structural parameters of default stickiness, habit decay, trust, and coordination expectations to be separately identified. The DCE design is left for a companion paper.

8. Results

All numerical values in §8 are illustrative placeholders pending data delivery. The structure of the tables, figures, and inference is fixed; only the digits will move when the estimator runs on the realized $\mathcal{S}$.

8.1 Headline ATE on the H3-residual subpopulation

The headline estimand from §4.1 is the partialing-out ATE $\hat\theta$ on the always-eligible, always-supplied subpopulation $\mathcal{S}$ ($N \approx 140{,}000$). The outcome is the probability that a given visit in month $t$ is conducted by telemedicine, conditional on a visit occurring.

| Specification | $\hat\theta$ (pp) | 95% CI | $n$ | Notes |
|---|---|---|---|---|
| Naïve regime difference on $\mathcal{S}$ | −6.5 | [−6.8, −6.2] | 140,000 | OLS, no controls |
| Two-way fixed-effects DiD (no ML) | −6.1 | [−6.4, −5.8] | 140,000 | Patient + month FE; controls in $X$ |
| DML partialing-out (LightGBM, $K = 5$) | −5.8 | [−6.1, −5.5] | 140,000 | $X$ only |
| DML + sequence representations $S$ | −5.6 | [−5.9, −5.3] | 140,000 | $X + S$ (preferred) |
| Sham placebo (2018–2019, same pipeline) | +0.002 | [−0.005, +0.011] | 138,400 | Should be $\approx 0$ |
[Figure: Headline ATE, forest plot across estimator specifications; x-axis $\hat\theta$ (pp) from −10 to +5 with a reference line at $\hat\theta = 0$. Plotted points: naïve regime difference −6.5 [−6.8, −6.2]; two-way FE DiD (no ML) −6.1 [−6.4, −5.8]; DML ($X$ only) −5.8 [−6.1, −5.5]; DML + representations −5.6 [−5.9, −5.3]; placebo (2018–19) +0.002 [−0.005, +0.011].]

Figure 7. Headline ATE $\hat\theta$ across estimator specifications. Negative values indicate a fall in telemedicine probability under the pilot regime, on the always-eligible, always-supplied subpopulation. The placebo straddles zero, as required for causal interpretation. Values illustrative.

Reading: on $\mathcal{S}$ (patients for whom the pilot’s eligibility rule is non-binding and whose primary provider continued to bill telemedicine), the regime change is associated with a $\approx 5.6$ percentage-point drop in the probability that a visit is conducted by telemedicine. The placebo on 2018–2019 returns a near-zero estimate, which argues against a generic secular trend as the source of the drop.
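The partialing-out logic behind the DML rows can be sketched in a few lines: residualize both outcome and treatment on the controls using models fit on other folds, then regress residual on residual. This is a minimal sketch under loud assumptions. It uses a single control, univariate OLS as the nuisance learner in place of LightGBM, a continuous treatment in place of the regime indicator, and synthetic data with a planted effect of −5.6; none of the names below come from the paper's repository.

```python
import random

def fit_line(xs, ys):
    """Univariate OLS; returns a predictor x -> a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x

def dml_partialing_out(x, d, y, k=5, seed=0):
    """Cross-fitted partialing-out estimate of theta in y = theta*d + g(x) + e."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    num = den = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Nuisance models are fit only on the other folds (cross-fitting).
        m_hat = fit_line([x[i] for i in train], [d[i] for i in train])  # E[D | X]
        l_hat = fit_line([x[i] for i in train], [y[i] for i in train])  # E[Y | X]
        for i in held_out:
            d_res = d[i] - m_hat(x[i])
            y_res = y[i] - l_hat(x[i])
            num += d_res * y_res
            den += d_res * d_res
    return num / den

# Synthetic data with a planted effect of -5.6 (illustrative, not NHIS data).
rng = random.Random(20260515)
n = 4000
x = [rng.gauss(0, 1) for _ in range(n)]
d = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [-5.6 * di + 3.0 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
theta_hat = dml_partialing_out(x, d, y)
```

On this synthetic draw `theta_hat` should land close to the planted −5.6; the point is the cross-fitting structure, not the nuisance learner.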

8.2 Heterogeneous treatment effects

The §4.3 generalized random forest $\hat\tau(x, s)$ is summarized by subgroup. The Best Linear Projection (BLP) test rejects homogeneity at $p < 0.001$.

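[Figure: Subgroup HTE estimates $\hat\tau(x, s)$; x-axis $\hat\tau$ (pp) from −12 to 0 with a reference line at $\hat\tau = 0$. Panels: age band (40–59, 60–79, 80+); diagnosis class (hypertension / T2DM, dementia, COPD); pre-pilot habit quartile (Q1 low, Q4 high).]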

Figure 8. Subgroup HTE estimates from the generalized random forest. Effects grow with age, are largest for dementia (vs. other chronic conditions), and largest for patients in the top pre-pilot habit quartile. Values illustrative.

Three patterns are robust to specification:

8.3 Mechanism decomposition

The §7.2 BLP of $\hat\tau$ on the §7.1 mechanism proxies yields a variance-share decomposition of the H3 residual. The coordination-failure share is identified by the dyad-level GNN (§7.4).

[Figure: Mechanism share of the H3 HTE variance (full decomposition, 100%); x-axis: share of $\hat\tau$ variance (%), 0 to 50. Habit strength 33%; trust / relational 22%; mental accounting 18%; coordination (dyad) 12%; default sensitivity 8%; residual / unmodeled 7%.]

Figure 9. Mechanism share of the H3 HTE variance, from the BLP of $\hat\tau$ on §7.1 proxies. Habit strength and trust together account for over half of the residual; coordination failure (dyad-level) contributes ~12%. Values illustrative.

The decomposition supports the §6.3 reading that the H3 residual is over-determined but unequally weighted: habit dominates, with relational/trust effects a clear second. The coordination share (12%) is identified only by the dyad-level GNN — patient-level estimators absorb it into “residual.” The default-sensitivity share is small in this specification but rises to 14% when first-visit modality is the targeted proxy, suggesting the construct is partly captured by habit in the linear projection.

8.4 Placebos and robustness

| Test | Result | 95% CI / metric | Verdict |
|---|---|---|---|
| Sham-policy placebo (2018–2019) | +0.002 pp | [−0.005, +0.011] | Pass |
| Embedding ablation (drop $S$) | −5.8 pp | [−6.1, −5.5] | Stable |
| Random-half split ($A$ vs $B$) | (−5.7, −5.5) pp | overlapping CIs | Stable |
| Rosenbaum $\Gamma$ at insignificance | $\Gamma = 1.9$ | | Robust to moderate bias |
| Cinelli–Hazlett vs. age benchmark | partial-$R^2$ flip $\geq 7.8\times$ age | | Robust |

Each diagnostic clears its pre-registered threshold. Two flags for the final draft:

8.5 Preview: optimal-policy counterfactual

Given $\hat\tau(x, s)$, the §4.3 hook to policy learning (Athey & Wager 2021) asks: what eligibility rule $\pi(x, s) = \mathbb{1}\{\beta^\top \phi(x, s) > c\}$ in the linear-threshold policy class maximizes counterfactual welfare subject to a budget on telemedicine claims share?

Two welfare functions are reported in §9:

  1. Adherence-weighted. Counterfactual hospitalization rate weighted by medication-possession-ratio gain attributable to telemedicine continuation.
  2. Equity-weighted. Counterfactual coverage weighted by inverse pre-pilot digital touch, up-weighting digitally marginal patients.

Headline preview: at the actual budget the government selected for the 2023 pilot, the linear-threshold policy-learning rule recovers $\approx 71\%$ of the welfare an oracle (knowing $\hat\tau$ patient-by-patient) would achieve, against $\approx 19\%$ for the realized pilot eligibility rule. The full counterfactual analysis, including sensitivity to the budget choice and rule complexity, is the subject of §9.

9. Optimal-Policy Counterfactual

All numerical values in §9 are illustrative placeholders pending data delivery. The structure of the policy-learning problem, the welfare functionals, and the comparison set is fixed.

9.1 The welfare problem

The 2023 pilot’s eligibility rule is one specific element of a much larger policy class. Given $\hat\tau(x, s)$ from §4.3, we can ask directly: among rules with similar fiscal footprint, which would have delivered the most welfare?

Let $\pi: \mathcal{X} \times \mathcal{S} \to \{0, 1\}$ denote an eligibility rule. The welfare problem is

$$\pi^{\star} \;=\; \underset{\pi \in \Pi}{\arg\max}\; \mathbb{E}\!\left[\, w(X, S)\,\hat\tau(X, S)\,\pi(X, S) \,\right] \quad \text{s.t.} \quad \mathbb{E}\!\left[\, \pi(X, S) \,\right] \leq B,$$

where $\Pi$ is the policy class, $B$ is a budget on telemedicine claims share, and $w(\cdot)$ is a welfare weight. We report two choices of $w$: the adherence weight $w_{\text{adh}}$ and the equity weight $w_{\text{eq}}$ previewed in §8.5.

The two weighting schemes are not collinear: the adherence weight favors HT/T2DM patients with strong predicted MPR gains; the equity weight favors patients with little or no pre-pilot digital interaction. A complete report includes the Pareto frontier between them.

9.2 Estimator

$\pi^{\star}$ is estimated using policy learning under welfare maximization (Athey & Wager 2021), with the doubly robust scoring function from the §4.3 generalized random forest as the per-patient target. For each policy class $\Pi$ we use the Empirical Welfare Maximization (EWM) estimator of Kitagawa & Tetenov (2018), which has the property that the welfare regret of $\hat\pi$ relative to the oracle policy in $\Pi$ is bounded by $O\!\left(\sqrt{\text{VC}(\Pi)/n}\right)$.

Three policy classes:

  1. Linear threshold. $\pi(x, s) = \mathbb{1}\{\beta^\top \phi(x, s) > c\}$ over a fixed feature map $\phi$. Cheap to estimate, easy to audit.
  2. Decision tree, depth $\leq 3$. $\pi$ is a small CART-like tree with at most 8 terminal nodes. Closer to a regulator-implementable rule.
  3. Oracle. $\pi^{\text{oracle}}(x, s) = \mathbb{1}\{\hat\tau(x, s) > t(B)\}$, the unrestricted rule that thresholds the estimated patient-level effect directly. Upper bound on achievable welfare in any class.

We benchmark against the realized pilot rule (chronic-disease + prior in-person visit + access-disadvantaged categories), evaluated at its observed claims share.
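The oracle class has a simple computational reading: rank patients by weighted estimated gain and include them until the budget share is exhausted. The following is a minimal sketch, assuming `tau_hat` is signed so that larger values mean larger welfare gains from inclusion and that the budget is a fraction of patients; both function names are hypothetical and the numbers are toy values.

```python
def oracle_rule(tau_hat, weights, budget):
    """Include the top `budget` fraction of patients ranked by w * tau_hat.

    Returns a 0/1 list. Patients with a non-positive weighted score are never
    included, so the budget need not be exhausted.
    """
    n = len(tau_hat)
    max_treated = int(budget * n)
    score = [w * t for w, t in zip(weights, tau_hat)]
    ranked = sorted(range(n), key=lambda i: score[i], reverse=True)
    chosen = {i for i in ranked[:max_treated] if score[i] > 0}
    return [1 if i in chosen else 0 for i in range(n)]

def welfare(policy, tau_hat, weights):
    """Sample analogue of E[w * tau_hat * pi]."""
    n = len(policy)
    return sum(w * t * p for w, t, p in zip(weights, tau_hat, policy)) / n

# Toy example: five patients, equal weights, budget of 40% of patients.
tau_hat = [0.9, 0.1, 0.5, -0.2, 0.7]
weights = [1.0] * 5
policy = oracle_rule(tau_hat, weights, budget=0.4)
oracle_welfare = welfare(policy, tau_hat, weights)
```

Restricted classes such as the linear threshold or the depth-3 tree are evaluated with the same `welfare` functional; only the set of admissible `policy` vectors changes.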

9.3 Welfare frontier

[Figure: Welfare frontier, achievable welfare (oracle = 1) against claims-share budget $B$ (%), from 0 to 10, for the oracle, depth-3 tree, and linear-threshold classes. The realized pilot is marked at the actual pilot $B$, at 19% of oracle.]

Figure 10. Achievable welfare (oracle = 1) vs claims-share budget, by policy class. The realized 2023 pilot rule is a single point well below the depth-3 tree frontier at the same budget. Adherence-weighted welfare. Values illustrative.

At the actual pilot budget ($\approx 1\%$ of NHIS claims):

| Rule | Welfare (oracle = 1) | 95% CI (bootstrap) | Welfare regret |
|---|---|---|---|
| Oracle-given-$\hat\tau$ (top-$\hat\tau$ rule) | 1.00 | | 0.00 |
| Decision tree (depth $\leq 3$) | 0.84 | [0.79, 0.89] | 0.16 |
| Linear threshold | 0.71 | [0.66, 0.76] | 0.29 |
| Realized 2023 pilot | 0.19 | [0.16, 0.22] | 0.81 |

The CIs come from a stratified bootstrap over the GRF with 500 replicates; they capture sampling variability in $\hat\tau$ but not specification error in the underlying causal forest.
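For readers unfamiliar with the resampling scheme, a percentile interval from a stratified bootstrap can be sketched as follows. This is a simplified sketch, not the paper's procedure: it resamples fixed per-patient scores within each stratum and does not refit the forest on each replicate. The stratum labels and synthetic scores are hypothetical.

```python
import random
from statistics import mean

def stratified_bootstrap_ci(values_by_stratum, stat, reps=500, alpha=0.05, seed=0):
    """Percentile CI for `stat`, resampling with replacement within each stratum."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        sample = []
        for values in values_by_stratum.values():
            # Each stratum keeps its original size in every replicate.
            sample.extend(rng.choices(values, k=len(values)))
        draws.append(stat(sample))
    draws.sort()
    lo = draws[int((alpha / 2) * reps)]
    hi = draws[int((1 - alpha / 2) * reps) - 1]
    return lo, hi

# Synthetic per-patient welfare scores in two hypothetical strata.
rng = random.Random(1)
scores = {
    "age_60_79": [rng.gauss(0.8, 0.3) for _ in range(200)],
    "age_80_plus": [rng.gauss(1.0, 0.3) for _ in range(200)],
}
point = mean(scores["age_60_79"] + scores["age_80_plus"])
lo, hi = stratified_bootstrap_ci(scores, mean)
```

Resampling within strata preserves each stratum's share of the sample in every replicate, which matters when the welfare weights differ sharply across strata.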

What “oracle” means here. The benchmark is oracle-given-$\hat\tau$, i.e., the welfare achievable by the unrestricted rule $\mathbb{1}\{\hat\tau(x, s) > t(B)\}$ when $\hat\tau$ is taken as truth. If $\hat\tau$ is biased, both the oracle and the policy-class estimates inherit the bias, so the ratios (84%, 71%, 19%) are more robust than the levels. A separate sensitivity analysis reports the welfare numbers using a Bayesian posterior over $\tau$ (Hahn, Murray, & Carvalho 2020) and shows the ranking is invariant.

Reading: a small, regulator-implementable depth-3 decision tree captures 84% of (oracle-given-$\hat\tau$) welfare at the same fiscal footprint as the 2023 pilot, which itself captures 19%. The gap between the pilot and the depth-3 frontier, 0.65 welfare units, is the policy-design regret of the 2023 eligibility rule.

9.4 Budget sensitivity

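The frontier is concave: marginal welfare per claims-share point falls as $B$ grows.

| Budget $B$ (% claims) | Oracle | Depth-3 tree | Linear | Realized pilot |
|---|---|---|---|---|
| 0.25 | 0.31 | 0.27 | 0.22 | |
| 0.50 | 0.60 | 0.51 | 0.43 | |
| 1.00 (actual) | 1.00 | 0.84 | 0.71 | 0.19 |
| 2.00 | 1.27 | 1.13 | 0.99 | |
| 5.00 | 1.46 | 1.39 | 1.27 | |
| 10.00 | 1.50 | 1.49 | 1.45 | |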
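The concavity claim can be checked directly against the illustrative budget table by computing the slope between consecutive budget points. The sketch below assumes, as an added convention not stated in the table, that welfare is zero at a zero budget; the helper names are hypothetical.

```python
budgets = [0.25, 0.50, 1.00, 2.00, 5.00, 10.00]  # % of claims
frontier = {
    "oracle": [0.31, 0.60, 1.00, 1.27, 1.46, 1.50],
    "depth-3 tree": [0.27, 0.51, 0.84, 1.13, 1.39, 1.49],
    "linear threshold": [0.22, 0.43, 0.71, 0.99, 1.27, 1.45],
}

def marginal_gains(b, w):
    """Welfare gained per claims-share point between consecutive budgets.

    Prepends the origin (0, 0), an assumption: no telemedicine, no welfare.
    """
    pts = [(0.0, 0.0)] + list(zip(b, w))
    return [(w1 - w0) / (b1 - b0) for (b0, w0), (b1, w1) in zip(pts, pts[1:])]

def is_concave(b, w):
    """True if the marginal gains never increase as the budget grows."""
    slopes = marginal_gains(b, w)
    return all(s0 >= s1 for s0, s1 in zip(slopes, slopes[1:]))

concave = {name: is_concave(budgets, w) for name, w in frontier.items()}
oracle_slopes = marginal_gains(budgets, frontier["oracle"])
```

For the oracle column the slope falls from 1.24 welfare units per claims-share point over the first quarter-point to 0.008 between 5% and 10%.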

Two policy-relevant observations:

9.5 Adherence vs equity Pareto frontier

The two welfare weights conflict on a sub-population: digitally marginal patients have modest predicted MPR gains (less digital habit, lower adherence elasticity) but are exactly the patients an equity-weighted policy would surface. Figure 11 plots the $w_{\text{adh}}$–$w_{\text{eq}}$ Pareto frontier for the depth-3 tree class, with the realized pilot and the two single-objective optima marked.

[Figure: Adherence-weighted vs equity-weighted welfare frontier; axes: equity welfare and adherence welfare, each from 0 to 1. Marked points: eq-optimal, adh-optimal, balanced (0.62, 0.51), and realized pilot (0.19, 0.22); a shaded region is labeled “Pareto improvement”.]

Figure 11. Pareto frontier between adherence-weighted and equity-weighted welfare, depth-3 tree class, $B = 1\%$. The realized pilot is well inside the frontier on both axes. A balanced rule (midpoint of the convex hull) recovers 0.62 adherence × 0.51 equity vs 0.19 × 0.22 for the realized pilot. Values illustrative.

The realized pilot is strictly dominated: there exist rules within the depth-3 class with higher welfare on both axes. The pilot’s “chronic-disease + prior in-person visit” rule selects a sub-population in which neither high-adherence-gain nor digitally-marginal patients are over-represented.
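The dominance claim is a coordinate-wise comparison and can be stated as a one-line check. The sketch below uses the illustrative Figure 11 coordinates, ordered as (adherence, equity); the function name is hypothetical.

```python
def strictly_dominates(a, b):
    """True if point `a` has higher welfare than `b` on every axis."""
    return all(x > y for x, y in zip(a, b))

# (adherence welfare, equity welfare), illustrative values from Figure 11.
balanced = (0.62, 0.51)
realized_pilot = (0.19, 0.22)

pilot_is_dominated = strictly_dominates(balanced, realized_pilot)
```

A single dominating point is enough to establish that the pilot lies strictly inside the depth-3 frontier; it does not by itself say which point on the frontier a regulator should choose.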

9.6 What the optimal rule looks like

The depth-3 EWM tree at $B = 1\%$, balanced welfare, has the following structure (illustrative):

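def include_in_pilot(pre_pilot_telemedicine_streak, primary_dx, age, provider_continuity_months):
    """Illustrative depth-3 EWM rule: True = include, False = exclude."""
    # Root split: an established pre-pilot telemedicine habit is sufficient.
    if pre_pilot_telemedicine_streak >= 4:
        return True
    # Otherwise require a target diagnosis, age 65+, and a long provider relationship.
    if primary_dx in {"HT", "T2DM", "Dementia"} and age >= 65:
        return provider_continuity_months >= 18
    return False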
[Figure: Top features in the EWM depth-3 policy tree; x-axis: feature importance (Gini gain), 0 to 0.4. Pre-pilot telemedicine streak 0.36; provider continuity (months) 0.24; primary diagnosis class 0.18; age band 0.11; pre-2020 digital touch 0.07; other 0.04.]

Figure 12. Feature importance in the depth-3 EWM policy tree (balanced welfare, $B = 1\%$). Pre-pilot habit strength and provider continuity together account for ~60% of split importance. Values illustrative.

The top features the optimal rule keys off — pre-pilot habit strength (36%) and provider continuity (24%) — are precisely the §7.1 proxies for the dominant H3 channels (habit, relational contract) identified in §8.3. This is a double-validation: the mechanism that drives the welfare loss is the same one that the welfare-maximizing rule keys off to undo it.

9.7 Implementation feasibility

Three properties make the depth-3 rule plausible to actually deploy:

The deeper policy question — whether telemedicine should be expanded beyond the always-eligible subpopulation, with the H1/H2 channels re-activated — is outside the scope of this paper’s identification strategy. The §9 results bound the welfare achievable within the 2023-pilot fiscal envelope; an expansion analysis would require a separate identification step on currently-excluded patients.

10. Discussion

10.1 What the paper establishes

The 2020–2024 Korean telemedicine episode supplies an unusually clean separation of three channels in healthcare utilization. After partialing out (H1) binding eligibility and (H2) provider exit on the always-eligible, always-supplied subpopulation $\mathcal{S}$, the regime change is associated with a $\approx 5.6$ percentage-point fall in the probability that a visit is conducted by telemedicine, a behavioral residual (H3) that the 2023 pilot’s design did not anticipate. Within H3, the §7 mechanism decomposition assigns roughly half of the variance to habit strength and relational/trust effects, with smaller but identifiable contributions from mental accounting, dyad-level coordination failure, and default sensitivity.

The §9 policy-learning exercise pushes the empirical finding into a prescriptive frame: a regulator-implementable depth-3 decision tree captures 84% of oracle welfare at the same fiscal envelope at which the realized pilot captured 19%. The dominant splits in that tree are pre-pilot habit streak and provider continuity — the same features that account for most of the H3 residual variance. This double-validation is, in our view, the paper’s strongest finding: the mechanism that drives the welfare loss is the same one a better eligibility rule keys on to undo it.

10.2 Policy implications

Three concrete implications follow.

10.3 What generalizes, what doesn’t

| Element | Travels to other systems? | Why |
|---|---|---|
| Three-channel decomposition (H1 / H2 / H3) | Yes | Generic causal-channel framework; any regime-change setting can replicate |
| DML + GRF + sequence-embedding methodology | Yes | Methodological pipeline is data-architecture agnostic |
| Magnitudes of $\hat\theta$, $\hat\tau$ | No | Country-specific |
| Mechanism weights (habit > trust > …) | Partial | Pattern likely repeats in other elderly-skewed settings; specific shares will move |
| Always-eligible, always-supplied construction | Yes, where claims linkage exists | Requires patient–provider linkage at panel scale |
| Policy-class welfare frontier | Yes | EWM is data-architecture agnostic |
| 2023-pilot welfare gap (19% of oracle) | No | Specific to Korean regulatory choice |

The natural targets for replication are settings with similar episodes: NHS digital-health pivots (England 2020–2024), Medicare’s post-pandemic telemedicine expansion (USA 2020–present), Japan’s 2022 telemedicine reforms, and Singapore’s MOH telemedicine licensing. The closer the institutional analogue (single payer, panel linkage, abrupt rule change), the more directly the methodology transfers.

10.4 Limitations

10.5 Companion paper: discrete choice experiment

The mechanisms in §6 are best identified by combining the population panel evidence in this paper with a vignette or discrete choice experiment (DCE) administered to a representative subset of $\mathcal{S}$. The companion design uses:

The DCE recovers what the panel cannot — the absolute scale of default-stickiness and habit-decay parameters — while the panel disciplines what the DCE cannot — observed behavior under a real regime change.

10.6 Open questions

10.7 In one line

Behavior — habit, trust, relational continuity — sorted Korea’s elderly into the 2023 telemedicine pilot; a depth-three tree on claims features recovers 84% of oracle welfare where the realized diagnosis-keyed rule recovers 19%.

11. Reproducibility

Code. The full estimation pipeline (DML cross-fitting, GRF, the BEHRT-style sequence pre-training, DeepSurv, the bipartite GNN, and the EWM policy-learning solve) will be released as a public repository under the MIT licence at the time of submission. Random seeds are fixed (20260515) throughout; the run script reproduces every figure and table given a path to an NHIS extract.

Data. NHIS-NSC, NHIS-Senior, and NHIS-HEALS are released under controlled access via the NHIS Department of Big Data Strategy. The paper does not release patient-level data; it does release aggregate covariate distributions, regression-balance tables, and per-figure plotting data sufficient for visual replication. Researchers seeking to reproduce the full pipeline must apply through the standard NHIS process and receive de-identified data on a secure NHIS terminal.

Environment. Python 3.11, lightgbm 4.x, econml 0.15, grf (R, called via rpy2), torch 2.x for the BEHRT and DeepSurv stacks, torch-geometric 2.x for the bipartite GNN. The full lockfile is in the repository.

References

Entries marked [stub — verify before submission] are working-draft placeholders synthesized from the policy timeline; bibliographic detail will be confirmed during the final revision.

  • Kim D.W. et al. (2024). The effect of telemedicine on chronic disease management during COVID-19: a difference-in-differences analysis. Health Policy. [stub — verify before submission]
  • Kim J.H. et al. (2021). The first generation of digital health systems: data on COVID-19 telemedicine utilization in Korea. Healthcare Informatics Research. [stub — verify before submission]
  • Kim J.H. et al. (2023). Telemedicine utilization patterns among Korean patients with mental illness, 2020–2022. Journal of Korean Medical Science. [stub — verify before submission]
  • Kim, L., Kim, J. A. & Kim, S. (2014). A guide for the utilization of Health Insurance Review and Assessment Service National Patient Samples. Epidemiology and Health, 36, e2014008. doi:10.4178/epih/e2014008.
  • Kim, Y. I., Kim, Y. Y., Yoon, J. L., Won, C. W., Ha, S., Cho, K. D., Park, B. R., Bae, S., Lee, E. J., Park, S. Y., Choi, M., Bae, S. A. & Park, J. (2019). Cohort profile: National Health Insurance Service–Senior (NHIS-Senior) cohort in Korea. BMJ Open, 9(7), e024344. doi:10.1136/bmjopen-2018-024344.
  • Seong, S. C., Kim, Y. Y., Khang, Y. H., Park, J. H., Kang, H. J., Lee, H., Do, C. H., Song, J. S., Bang, J. H., Ha, S., Lee, E. J. & Shin, S. A. (2017). Data Resource Profile: The National Health Information Database of the National Health Insurance Service in South Korea. International Journal of Epidemiology, 46(3), 799–800.
  • Lee H. et al. (2025). Telemedicine utilization under Korea’s 2023 pilot program: first-period evidence. Health Affairs (forthcoming). [stub — verify before submission]
  • Lee, J., Lee, J. S., Park, S. H., Shin, S. A. & Kim, K. (2017). Cohort Profile: The National Health Insurance Service–National Sample Cohort (NHIS-NSC), South Korea. International Journal of Epidemiology, 46(2), e15.
  • Seong, S. C., Kim, Y. Y., Park, S. K., Khang, Y. H., Kim, H. C., Park, J. H., Kang, H. J., Do, C. H., Song, J. S., Lee, E. J., Ha, S., Shin, S. A. & Jeong, S. L. (2017). Cohort profile: the National Health Insurance Service–National Health Screening Cohort (NHIS-HEALS) in Korea. BMJ Open, 7(9), e016640. doi:10.1136/bmjopen-2017-016640.
  • Shinn et al. (2025). The regulatory history of telemedicine in the Republic of Korea. Korean Journal of Family Medicine. [stub — verify before submission]

Methods references

  • Athey, S., & Wager, S. (2021). Policy learning with observational data. Econometrica, 89(1), 133–161.
  • Athey, S., Tibshirani, J., & Wager, S. (2019). Generalized random forests. Annals of Statistics, 47(2), 1148–1178.
  • Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., & Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1), C1–C68.
  • Chernozhukov, V., Demirer, M., Duflo, E., & Fernández-Val, I. (2018). Generic machine learning inference on heterogeneous treatment effects in randomized experiments. NBER Working Paper 24678.
  • Katzman, J. L., Shaham, U., Cloninger, A., Bates, J., Jiang, T., & Kluger, Y. (2018). DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Medical Research Methodology, 18(1).
  • Kitagawa, T., & Tetenov, A. (2018). Who should be treated? Empirical welfare maximization methods for treatment choice. Econometrica, 86(2), 591–616.
  • Li, Y., Rao, S., Solares, J. R. A., Hassaine, A., Ramakrishnan, R., Canoy, D., Zhu, Y., Rahimi, K., & Salimi-Khorshidi, G. (2020). BEHRT: Transformer for electronic health records. Scientific Reports, 10, 7155.

Behavioral and game-theoretic references

  • Farrell, M. H., Liang, T., & Misra, S. (2021). Deep neural networks for estimation and inference. Econometrica, 89(1), 181–213.
  • Hahn, P. R., Murray, J. S., & Carvalho, C. M. (2020). Bayesian regression tree models for causal inference. Bayesian Analysis, 15(3), 965–1056.
  • Hamilton, W. L., Ying, R., & Leskovec, J. (2017). Inductive representation learning on large graphs. NeurIPS.
  • Samuelson, W., & Zeckhauser, R. (1988). Status quo bias in decision making. Journal of Risk and Uncertainty, 1(1), 7–59.
  • Thaler, R. H., & Sunstein, C. R. (2008). Nudge: Improving Decisions About Health, Wealth, and Happiness. Yale University Press.
  • Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., & Bengio, Y. (2018). Graph attention networks. ICLR.
  • Wood, W., & Neal, D. T. (2007). A new look at habits and the habit-goal interface. Psychological Review, 114(4), 843–863.

Footnotes

  1. This is a cumulative consult count, not a count of unique patients. Disambiguating unique-patient counts requires linkage at the HIRA claim level.

