# Legacy Production Migration Runbook

Date: 2026-05-25

See also:

- `docs/legacy_first_real_run_checklist.md` for the first production/staging execution pass after `legacy:import`

## Goal

This runbook defines the **actual operational process** for importing legacy ECOIN data into the new ECOLE ECOIN platform safely.

It is designed for real production-like execution with these constraints:

- no direct blind import into `users`
- no direct blind import into active `enrollments`
- no automatic WhatsApp sending
- no automatic payment creation
- no waiting-session-to-cohort conversion
- review-first, batch-based execution only

This runbook assumes the Legacy Import module is already deployed and migrated.

## Scope

This runbook covers:

- legacy DB connection setup
- import run creation
- validation
- classification
- dry-run review
- historical completed training prioritization
- safe commit slices
- rollback procedure
- production control checkpoints

This runbook does **not** approve:

- mass auto-import of all rows at once
- direct conversion of all historical registrations into modern enrollments
- bulk payment reconstruction
- attendance reconstruction

## Phase 0: Preconditions

Before starting, confirm:

1. the new platform database backup is taken
2. the old platform database is reachable from the new platform server
3. the legacy DB user has `SELECT` only
4. the admin user executing the run has:
   - `legacy_imports.view`
   - `legacy_imports.run`
   - `legacy_imports.validate`
   - `legacy_imports.review`
   - `legacy_imports.commit`
   - `legacy_imports.rollback`
5. legacy import commit paths do not trigger production queue side effects
6. the team understands that `historical completed training` is the primary migration priority
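
Precondition 3 can be checked against the output of MySQL's `SHOW GRANTS`. The sketch below is one minimal way to do that, assuming the `mysql` client is installed on the new platform server and that the reader account matches the one configured in Phase 1. The function name `grants_are_select_only` is illustrative and is not part of the Legacy Import module.

```bash
# Hypothetical helper: reads SHOW GRANTS output on stdin and succeeds only
# when every grant line is USAGE or SELECT and none carries GRANT OPTION.
grants_are_select_only() {
  grants=$(cat)
  # An empty result proves nothing, so treat it as a failure.
  [ -n "$grants" ] || return 1
  # Any line that is not a plain USAGE/SELECT grant is a violation.
  if printf '%s\n' "$grants" | grep -Ev '^GRANT (USAGE|SELECT) ON ' >/dev/null; then
    return 1
  fi
  # WITH GRANT OPTION would let the reader account hand out privileges.
  if printf '%s\n' "$grants" | grep -q 'WITH GRANT OPTION'; then
    return 1
  fi
  return 0
}

# Usage (host, port and user are the Phase 1 example values; -p prompts
# for the password):
#   mysql -h 127.0.0.1 -P 3306 -u ecoin_legacy_reader -p \
#     --batch --skip-column-names -e 'SHOW GRANTS FOR CURRENT_USER()' \
#     | grants_are_select_only && echo "SELECT only" || echo "too many privileges"
```

The filter is deliberately strict: a column-level `SELECT (col)` grant or a combined `SELECT, INSERT` grant is reported as a failure and should then be inspected by hand.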

## Phase 1: Configure the Legacy Database Connection

Set these values in the production `.env`:

```dotenv
LEGACY_IMPORT_CONNECTION=legacy
LEGACY_IMPORT_FOCUS_MODE_DEFAULT=historical_completed

LEGACY_DB_DRIVER=mysql
LEGACY_DB_HOST=127.0.0.1
LEGACY_DB_PORT=3306
LEGACY_DB_DATABASE=ecoin_legacy
LEGACY_DB_USERNAME=ecoin_legacy_reader
LEGACY_DB_PASSWORD=strong-password
LEGACY_DB_SOCKET=
LEGACY_DB_CHARSET=utf8mb4
LEGACY_DB_COLLATION=utf8mb4_unicode_ci
LEGACY_DB_STRICT=true
# LEGACY_MYSQL_ATTR_SSL_CA=/path/to/ca.pem
```

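Then clear the cached configuration so the new values are read (if the deployment normally runs `php artisan config:cache`, rebuild the cache afterwards):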

```bash
php artisan config:clear
```
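
Before moving on, it can help to confirm that no connection value was left blank. The following is a minimal sketch, assuming the keys listed in the loop are the ones that must be non-empty (that selection is an assumption; `LEGACY_DB_SOCKET` is intentionally left out because it is empty in the example above). The helper name `missing_legacy_env_keys` is hypothetical.

```bash
# Hypothetical helper: prints every required legacy key that is absent or
# empty in the given .env file. Succeeds only when nothing is missing.
missing_legacy_env_keys() {
  env_file=$1
  missing=0
  for key in LEGACY_IMPORT_CONNECTION LEGACY_DB_DRIVER LEGACY_DB_HOST \
             LEGACY_DB_PORT LEGACY_DB_DATABASE LEGACY_DB_USERNAME \
             LEGACY_DB_PASSWORD; do
    # Match "KEY=value" with at least one character after the equals sign.
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "$key"
      missing=1
    fi
  done
  return "$missing"
}

# Usage: missing_legacy_env_keys .env || echo "fix the keys listed above"
```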

## Phase 2: Verify the Connection

Run:

```bash
php artisan legacy:import --connection=legacy
```

Expected result:

- a new import run is created
- the command prints the run reference
- the command prints `Legacy connection used: legacy`
- the command prints the reference counts for:
  - legacy formations
  - legacy sessions
  - legacy students
- the command prints a sample of imported formation names
- the command prints the direct course-mapping review URL for the run
- no data is committed into active business tables yet

If this fails:

- stop immediately
- do not retry blindly
- verify host, port, credentials, firewall, and DB user privileges

## Phase 3: Build the Run

After import, run:

```bash
php artisan legacy:validate --run=RUN_ID
php artisan legacy:classify --run=RUN_ID
php artisan legacy:dry-run --run=RUN_ID
```

Replace `RUN_ID` with the identifier of the run created by the import step in Phase 2.

Expected result:

- staging rows exist
- validation statuses are assigned
- classifications are assigned
- dry-run summary is produced
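
The three commands above can be chained so that a failure in one step prevents the next from running. This is a minimal sketch, assuming each artisan command exits non-zero on failure (not confirmed here for the Legacy Import module); the wrapper name `build_legacy_run` is hypothetical.

```bash
# Hypothetical wrapper: runs the three Phase 3 commands in order and stops
# at the first one that exits non-zero.
build_legacy_run() {
  run_id=${1:-}
  if [ -z "$run_id" ]; then
    echo "usage: build_legacy_run RUN_ID" >&2
    return 2
  fi
  for step in validate classify dry-run; do
    echo "==> legacy:${step} --run=${run_id}"
    if ! php artisan "legacy:${step}" "--run=${run_id}"; then
      echo "legacy:${step} failed; later steps were not run" >&2
      return 1
    fi
  done
}

# Usage: build_legacy_run RUN_ID
```

Stopping at the first failure keeps classification and dry-run output from being produced on top of rows that never passed validation.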

## Phase 4: Open the Admin Review Screens

Use these pages:

- `/admin/legacy-imports`
- `/admin/legacy-imports/RUN_ID`
- `/admin/legacy-imports/students?run=RUN_ID`
- `/admin/legacy-imports/review?run=RUN_ID`
- `/admin/legacy-imports/review?run=RUN_ID&focusMode=historical_completed`
- `/admin/legacy-imports/course-mapping?run=RUN_ID`
- `/admin/legacy-imports/session-mapping?run=RUN_ID`

Operational rule:

- start with `historical_completed`
- do not start from `lead_only`
- do not start from `registered_pending`
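
To avoid hand-editing `RUN_ID` into each path, the run-specific pages can be printed for a given run, with the `historical_completed` view first to match the operational rule. This is a small sketch; the base URL is an assumption supplied by the operator, and the helper name `legacy_review_urls` is hypothetical.

```bash
# Hypothetical helper: prints the run-specific review pages, starting with
# the historical_completed focus view.
legacy_review_urls() {
  base_url=${1%/}   # drop one trailing slash, if present
  run_id=$2
  printf '%s\n' \
    "${base_url}/admin/legacy-imports/review?run=${run_id}&focusMode=historical_completed" \
    "${base_url}/admin/legacy-imports/${run_id}" \
    "${base_url}/admin/legacy-imports/students?run=${run_id}" \
    "${base_url}/admin/legacy-imports/course-mapping?run=${run_id}" \
    "${base_url}/admin/legacy-imports/session-mapping?run=${run_id}"
}

# Usage: legacy_review_urls "https://your-platform-host" RUN_ID
```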

## Phase 5: Dry-Run Decision Gate

Before any commit, review the dry-run summary.

### Mandatory checks

Review:

1. completed history count
2. archived training count
3. registered pending count
4. lead-only count
5. duplicate count
6. placeholder session count
7. unmapped course count
8. unknown or needs-review count

### Stop conditions

Do **not** commit yet if:

- the unmapped course count is still too high
- many rows are landing in `unknown`
- placeholder sessions are being mistaken for real sessions
- duplicate hints look suspicious
- completed-history classification looks clearly wrong on sampled rows
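
The stop conditions above are qualitative. If the team wants a numeric guard for the countable ones, a helper like the following can be agreed on before the review. It is only a sketch: the 5% default is an illustrative assumption, not a threshold defined by this runbook, and the counts are typed in by hand from the dry-run summary. The name `gate_check` is hypothetical.

```bash
# Hypothetical gate helper. The 5% default threshold is an illustrative
# assumption, not a value defined by this runbook.
gate_check() {
  label=$1
  count=$2
  total=$3
  max_pct=${4:-5}
  if [ "$total" -le 0 ]; then
    echo "STOP: ${label}: total must be greater than zero"
    return 1
  fi
  # Compare count/total against max_pct/100 using integers only.
  if [ $((count * 100)) -gt $((total * max_pct)) ]; then
    echo "STOP: ${label}: ${count} of ${total} exceeds ${max_pct}%"
    return 1
  fi
  echo "OK: ${label}: ${count} of ${total} is within ${max_pct}%"
}

# Usage (numbers are placeholders, not real counts):
#   gate_check "unmapped courses" 12 140
#   gate_check "unknown rows" 30 3000 2
```

A passing numeric check does not replace the sampled review: misclassified placeholder sessions and suspicious duplicate hints still need a human look.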

## Phase 6: Mapping Review

Review these first:

1. `legacy_course_mappings`
2. `legacy_session_mappings`

Priority:

1. mappings used by `completed` and `archived_training`
2. mappings used by `registered`
3. mappings used by lead-only rows

Rules:

- do not force a course match if confidence is weak
- leave weak records in `needs_review`
- keep waiting sessions as `placeholder_session`
- do not upgrade placeholder sessions into real sessions just to “finish the import”

## Phase 7: Historical Completed Review First

This is the core production recommendation for your data.

Because most rows represent people who already trained, start here:

- `/admin/legacy-imports/review?run=RUN_ID&focusMode=historical_completed`

Sample at least:

1. 20 records from `completed`
2. 20 records from `archived_training`
3. 10 records with duplicate hints
4. 10 records with missing or unusual course/session references

Confirm:

- names look sane
- phones are normalized correctly
- invalid placeholder emails are ignored
- classification matches business reality
- import action makes sense for historical training

## Phase 8: Export the Review Report

From the run detail page, export:

- masked CSV
- JSON summary

Use these exports for:

- stakeholder review
- migration sign-off
- audit trail

Recommended sign-off rule:

- no safe commit until at least one human has reviewed the exported dry-run report

## Phase 9: First Safe Commit Batch

Start with a **small batch only**.

Recommended first batch:

- `25` rows maximum
- historical completed rows only

Run:

```bash
php artisan legacy:commit --run=RUN_ID --batch=25
```

Alternatively, trigger the safe commit from the run detail page for UI-driven execution.
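
When the first batch is run from the command line, a small guard can prevent an accidental oversized batch. The sketch below wraps the command shown above and rejects anything over 25 rows. The function name `first_safe_commit` is hypothetical, and the guard only limits batch size: it does not itself restrict the batch to historical completed rows.

```bash
# Hypothetical guard around the Phase 9 command: refuses any first batch
# larger than 25 rows before calling artisan.
first_safe_commit() {
  run_id=${1:-}
  batch=${2:-25}
  case "$batch" in
    ''|*[!0-9]*) echo "batch must be a positive integer" >&2; return 2 ;;
  esac
  if [ -z "$run_id" ] || [ "$batch" -lt 1 ] || [ "$batch" -gt 25 ]; then
    echo "usage: first_safe_commit RUN_ID [BATCH, 1 to 25]" >&2
    return 2
  fi
  php artisan legacy:commit "--run=${run_id}" "--batch=${batch}"
}

# Usage: first_safe_commit RUN_ID 25
```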

Only these safe outputs are expected:

- CRM leads for supported `lead_only` rows
- `legacy_completed_trainings` for supported historical rows
- `legacy_imported_enrollments` only for safe registered rows with real sessions
- links to existing users only on high-confidence duplicate paths

Outputs that must not appear:

- payments
- attendance
- WhatsApp sends
- direct active enrollment orchestration

## Phase 10: Post-Commit Verification

After the first batch, verify:

1. the run summary changed as expected
2. commit logs were written
3. no unexpected users were overwritten
4. no modern enrollments were mutated unexpectedly
5. no placeholder-session row was committed as a real enrollment
6. historical completed rows landed in compatibility tables only

Review manually in:

- `/admin/legacy-imports/RUN_ID`
- `/admin/legacy-imports/review?run=RUN_ID&focusMode=historical_completed`

## Phase 11: Rollback Drill

Before running larger batches, perform one rollback drill.

Run:

```bash
php artisan legacy:rollback --run=RUN_ID
```

Expected behavior:

- records created by the legacy import safe slice are removed or reset
- pre-existing users remain intact
- rollback logs are recorded

If rollback does not behave exactly as expected:

- stop the migration
- do not proceed to larger batches

## Phase 12: Scale Up in Controlled Waves

Only after a successful first batch and rollback drill:

1. commit 100 rows
2. verify again
3. commit the next 100 rows
4. verify again

Recommended scaling pattern:

- batch 25
- batch 100
- batch 100
- batch 250 only if previous waves are clean

Do not jump from 25 to 3000.
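
The commit-then-verify rhythm can be enforced with a loop that refuses to start the next wave until the operator confirms the previous one. This is a minimal sketch, assuming that repeated `legacy:commit` calls on the same run pick up the next uncommitted rows (not confirmed here); the function name `run_commit_waves` is hypothetical.

```bash
# Hypothetical wave runner: commits one batch per wave and waits for an
# explicit "yes" on stdin after each wave before starting the next one.
run_commit_waves() {
  if [ "$#" -lt 2 ]; then
    echo "usage: run_commit_waves RUN_ID BATCH [BATCH...]" >&2
    return 2
  fi
  run_id=$1
  shift
  for batch in "$@"; do
    echo "==> committing batch of ${batch} rows for run ${run_id}"
    php artisan legacy:commit "--run=${run_id}" "--batch=${batch}" || return 1
    printf 'Batch of %s verified clean (Phase 10 checks)? Type yes to continue: ' "$batch"
    read -r answer || answer=""
    if [ "$answer" != "yes" ]; then
      echo "stopping after batch ${batch}"
      return 1
    fi
  done
}

# Usage after the first batch and rollback drill: run_commit_waves RUN_ID 100 100 250
```

Any answer other than `yes`, including an empty line, stops the loop, so the default outcome of an interrupted review is that no further rows are committed.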

## Recommended Operational Order

For your data, the best execution order is:

1. `completed`
2. `archived_training`
3. `registered` with real sessions only
4. `lead_only`
5. leave ambiguous rows in `needs_review`

This keeps the historical truth intact and avoids polluting CRM or modern enrollment flows.

## Safety Rules

Always follow these rules:

- never commit the entire run in one shot
- never unmask raw payload in broad exports
- never promote waiting sessions into cohorts automatically
- never import payments without a validated payment source table
- never treat all legacy rows as leads
- never overwrite a pre-existing user from a legacy row

## Incident Response

If anything looks wrong:

1. stop committing immediately
2. export the run summary
3. inspect commit logs
4. roll back the run
5. review mappings and review actions
6. relaunch only after human re-approval

## Final Go/No-Go Checklist

Only proceed with broader migration if all are true:

- legacy connection verified
- dry-run reviewed
- historical completed sample approved
- mappings reviewed
- first safe batch passed
- rollback drill passed
- commit logs look correct
- no unexpected business-side mutations observed

If any item is false, treat the run as **not ready**.
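
The checklist can be recorded mechanically so that a single "no" is impossible to overlook. The sketch below takes one answer per checklist item, in the order listed above, and reports every item that is not an explicit `yes`. The function name `go_no_go` is hypothetical, and the answers still come from human review.

```bash
# Hypothetical checklist helper: takes eight answers in checklist order and
# reports GO only when every answer is exactly "yes".
go_no_go() {
  if [ "$#" -ne 8 ]; then
    echo "expected 8 answers, got $#" >&2
    return 2
  fi
  status=0
  for item in \
    "legacy connection verified" \
    "dry-run reviewed" \
    "historical completed sample approved" \
    "mappings reviewed" \
    "first safe batch passed" \
    "rollback drill passed" \
    "commit logs look correct" \
    "no unexpected business-side mutations observed"; do
    # $1 is the answer for the current item; shift moves to the next one.
    if [ "$1" != "yes" ]; then
      echo "NO-GO: ${item}"
      status=1
    fi
    shift
  done
  if [ "$status" -eq 0 ]; then
    echo "GO"
  fi
  return "$status"
}

# Usage: go_no_go yes yes yes yes yes no yes yes
```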