Database Refactoring Blueprint

The Ultimate Guide to
Scaling Your Specifications System

An advanced, all-inclusive blueprint covering problem analysis, architectural comparisons, and a complete, week-by-week migration plan.

Architectural Analysis: The Four Designs

Design 1: The Original "Wide Table" Schema

The current state. Simple for basic queries but fundamentally unscalable, leading to the hard limit on foreign keys across multiple core tables (`jobs`, `tasks`, `talent_profiles`, etc.).

`jobs` (Original)
idINT (PK)Primary Key
source_language_idINT (FK)Rigid spec column
... 50+ more ...INT (FK)Leads to limit
`talent_profiles` (Original)
idINT (PK)Primary Key
source_language_idINT (FK)Same column repeated
... 50+ more ...INT (FK)Difficult to maintain

Design 2: The EAV (Entity-Attribute-Value) Model

A flexible model that centralizes all specification values into a single, generic table, but sacrifices data integrity (e.g., numbers stored as text) and is notoriously slow for typed queries and reports.

`specification_types`
idINT (PK)Type's unique ID
nameVARCHAR'Source Language'
`specification_values`
idINT (PK)Value's unique ID
spec_type_idINT (FK)Links to type
valueVARCHARPoor data integrity
`entity_specifications`
entity_idINTe.g., job_id
spec_value_idINT (FK)Links to the value

Design 3: Separate "Concrete" Tables

Creates a dedicated linking table per entity. This appears clean but makes cross-entity queries (e.g., `jobs` vs `talents`) extremely complex and slow, while requiring schema changes to add new specifiable entities.

`jobs_specifications`
job_idINT (FK)Links to jobs
spec_type_idINT (FK)Links to type
spec_value_idINTLinks to value ID
`tasks_specifications`
task_idINT (FK)Links to tasks
spec_type_idINT (FK)Links to type
spec_value_idINTLinks to value ID

Design 4: The Polymorphic Model (Recommended)

The definitive solution. A single, powerful linking table serves all entities. It is highly scalable, maintainable, and performant for your system's complex needs.

`specification_types`
idINT (PK)Type's unique ID
nameVARCHAR'Source Language'
target_tableVARCHAR'source_languages'
`specifications`
specifiable_idINTe.g., job_id, task_id
specifiable_typeVARCHAR'Job', 'Task'
spec_type_idINT (FK)Links to type
spec_value_idINTLinks to value ID

Head-to-Head Comparison of All Designs

CriteriaDesign 1 (Wide)Design 2 (EAV)Design 3 (Separate)Design 4 (Polymorphic)
ScalabilityNone.High, but with major trade-offs.Medium. Requires schema changes.Very High. No schema changes. Winner
Performance (Complex Queries like Talent Matching)Okay.Very Poor.Poor.Excellent. Winner
Code MaintenanceDifficult.Very Complex.Very High (Duplicated).Low (Reusable). Winner

The Final Blueprint: Plan & Queries

Part 1: The Complete SQL Query Library

Schema Definition (DDL)

-- Table to define the types of specifications
CREATE TABLE `specification_types` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `name` VARCHAR(255) NOT NULL UNIQUE,
  `target_table` VARCHAR(255) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

-- The new polymorphic table to hold all specification associations
CREATE TABLE `specifications` (
  `specifiable_id` INT NOT NULL COMMENT 'ID of the Job, Task, Talent, etc.',
  `specifiable_type` VARCHAR(255) NOT NULL COMMENT 'e.g., Job, Task, TalentProfile',
  `specification_type_id` INT NOT NULL,
  `specification_value_id` INT NOT NULL,
  PRIMARY KEY (`specifiable_id`, `specifiable_type`, `specification_type_id`),
  INDEX `idx_spec_type_value` (`specification_type_id`, `specification_value_id`),
  CONSTRAINT `fk_spec_type` FOREIGN KEY (`specification_type_id`) REFERENCES `specification_types`(`id`) ON DELETE CASCADE
) ENGINE=InnoDB;

Data Seeding (DML)

-- Populate the types directory with your specifications
INSERT INTO `specification_types` (id, name, target_table) VALUES 
(1, 'Source Language', 'source_languages'), 
(2, 'Target Language', 'target_languages'),
(3, 'Subject Matter', 'subject_matters');

Transactional Queries (DML)

-- INSERT specs for a NEW JOB (ID: 5000)
INSERT INTO `specifications` (specifiable_id, specifiable_type, specification_type_id, specification_value_id)
VALUES (5000, 'Job', 1, 10), (5000, 'Job', 3, 25);

-- UPDATE specs for an existing TASK (ID: 750)
-- First, remove old spec (e.g. old subject matter)
DELETE FROM `specifications` WHERE specifiable_id = 750 AND specifiable_type = 'Task' AND specification_type_id = 3;
-- Then, insert new one
INSERT INTO `specifications` (specifiable_id, specifiable_type, specification_type_id, specification_value_id)
VALUES (750, 'Task', 3, 30); -- New Subject Matter ID 30

Retrieval Queries (DQL)

-- Get all specs for a SINGLE JOB (ID: 5000)
SELECT st.name, s.specification_value_id
FROM specifications s JOIN specification_types st ON s.specification_type_id = st.id
WHERE s.specifiable_id = 5000 AND s.specifiable_type = 'Job';

-- COMPLEX REPORT: Find talents for a job
SELECT t.* FROM talent_profiles t WHERE EXISTS (
    SELECT 1 FROM specifications job_spec
    JOIN specifications talent_spec ON job_spec.specification_value_id = talent_spec.specification_value_id
    WHERE job_spec.specifiable_id = 5000 -- Job ID
      AND job_spec.specifiable_type = 'Job'
      AND job_spec.specification_type_id = 1 -- Source Language
      AND talent_spec.specifiable_type = 'TalentProfile'
      AND talent_spec.specifiable_id = t.id
);

Part 2: The 8-Week Migration Plan

Phase 1: Foundation & Core Logic (Weeks 1-3)

Establish the database schema and refactor the most complex features (Pricing, Talent Matching). The key to safety during this phase is the **Dual-Write Strategy**.

The Dual-Write Strategy

For a safe, zero-downtime migration, your application logic will be temporarily modified to write to **both** the old columns (e.g., `jobs.source_language_id`) and the new `specifications` table simultaneously. This creates a safety net, ensuring data is never lost and allowing you to switch your "read" queries to the new system with confidence while the old system continues to function as a backup. This is the key to a risk-free migration.

Phase 2: Full Feature & UI Migration (Weeks 4-7)

Methodically migrate the remaining 10+ business features and all associated UI components. Apply the dual-write strategy to `tasks`, `pricebook`, etc., as their features are refactored. Perform historical data backfills for these entities during this phase.

Phase 3: Finalization & Cleanup (Week 8)

After extensive testing proves the new system is stable, decommission the old structure. Remove all dual-write logic from the application code, then execute the `ALTER TABLE ... DROP COLUMN` scripts to remove the old, redundant fields from all affected tables.