Code Refactoring Strategies with Genspark: Practical Techniques for Incrementally Improving Legacy Code

🏚️ Fighting Legacy Code: The Initial Despair

The legacy code in question was a single 20,000-line JavaScript file written three years earlier. It had almost no comments, and function names such as "func1" and "process2" conveyed nothing about their purpose. A migration to TypeScript had been considered, but nobody touched the file because the change was judged too risky.

💀 Characteristics of Legacy Code
Huge single file: All logic packed into a "main.js" file exceeding 20,000 lines
Abuse of global variables: More than 50 interdependent global variables
No test code: Not a single unit test exists
Tight coupling: Any change risks breaking unrelated features
Lack of documentation: The original author has already left the company

However, through collaboration with Genspark, a phased refactoring approach has proven effective. This article shares the specific strategies and methods used.

🎯 Refactoring Strategy: A 5-Phase Approach

Phase 1: Current State Assessment and Visualization

The first step in refactoring is to understand the code. We asked Genspark to "analyze this codebase."

// Instruction to Genspark:
「Analyze the following files and extract the following information:
1. List of functions and their dependencies
2. Usage of global variables
3. Duplicate code patterns
4. Detection of circular dependencies
5. Code complexity (Cyclomatic Complexity)
6. Identification of potential bugs」

// Genspark's Analysis Results (excerpt)
/*
Analysis Results:
- Total functions: 127
- Global variables: 53
- Average function length: 82 lines
- Longest function: processData() (412 lines)
- Duplicate code: 23 patterns (total 1,200 lines duplicated)
- Circular dependencies: 12 locations
- High complexity functions: 18 (complexity > 20)

Top priority areas for correction:
1. processData() function (412 lines, complexity 42)
2. updateUI() function (387 lines, complexity 38)
3. Usage of global variable 'state' (referenced in 98 locations)
*/
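The complexity figures in a report like this can be spot-checked locally. The sketch below is a rough approximation under stated assumptions, not the method Genspark uses: it estimates cyclomatic complexity as one plus the number of decision points found by a regular expression. `estimateComplexity` and `sampleSource` are hypothetical names, and because nothing is parsed, keywords inside strings or comments are counted too.

```typescript
// Rough cyclomatic-complexity estimate: 1 + number of decision points.
// Assumption: a regex scan is good enough for triage. It does not parse the
// code, so keywords inside strings or comments are counted as well.
const DECISION_POINTS =
  /\b(?:if|for|while|case|catch)\b|&&|\|\||(?<!\?)\?(?![?.:])/g;

function estimateComplexity(source: string): number {
  const matches = source.match(DECISION_POINTS) ?? [];
  return 1 + matches.length;
}

// Hypothetical sample with five decision points: if, &&, for, if, ternary.
const sampleSource = `
function pickLabel(items, fallback) {
  if (items && items.length > 0) {
    for (const item of items) {
      if (item.label) {
        return item.label;
      }
    }
  }
  return fallback ? fallback : 'none';
}
`;

console.log(estimateComplexity(sampleSource)); // 6
```

A dedicated linter or parser-based tool will give more reliable numbers. The point of a quick estimate is only to confirm that the functions flagged as top priority really are the outliers.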
            

✅ Utilizing Visualization Tools

When we asked Genspark to "generate a dependency diagram in Mermaid format," we were able to grasp the code structure at a glance.

```mermaid
graph TD
    A[main.js] --> B[processData]
    A --> C[updateUI]
    B --> D[fetchData]
    B --> E[validateData]
    B --> F[transformData]
    C --> G[renderTable]
    C --> H[renderChart]
    D --> I[API Client]
    E --> D
    F --> E
    G --> B
    H --> B
    style B fill:#ff6b6b
    style C fill:#ff6b6b
```
                

The nodes highlighted in red, processData and updateUI, are the problem areas flagged by the analysis, including their involvement in circular dependencies.
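Circular dependencies reported by an assistant are worth confirming mechanically. Here is a minimal sketch, assuming the call graph has already been extracted into an adjacency list; `DependencyGraph`, `findCycle`, and the edges in the example are hypothetical and not taken from the real codebase.

```typescript
// Adjacency list: each key depends on (calls) the functions in its array.
type DependencyGraph = Record<string, string[]>;

// Depth-first search that returns the first cycle found, or null if none exists.
function findCycle(graph: DependencyGraph): string[] | null {
  const state = new Map<string, 'visiting' | 'done'>();
  const path: string[] = [];

  const visit = (node: string): string[] | null => {
    if (state.get(node) === 'done') return null;
    if (state.get(node) === 'visiting') {
      // Back edge: the cycle is the current path from the first visit of node.
      return [...path.slice(path.indexOf(node)), node];
    }
    state.set(node, 'visiting');
    path.push(node);
    for (const next of graph[node] ?? []) {
      const cycle = visit(next);
      if (cycle) return cycle;
    }
    path.pop();
    state.set(node, 'done');
    return null;
  };

  for (const node of Object.keys(graph)) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
}

// Hypothetical edges for illustration only.
const exampleGraph: DependencyGraph = {
  updateUI: ['renderTable'],
  renderTable: ['processData'],
  processData: ['updateUI'],
};

console.log(findCycle(exampleGraph));
// [ 'updateUI', 'renderTable', 'processData', 'updateUI' ]
```

Running a check like this before and after each refactoring step shows whether the number of cycles is actually going down.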

Phase 2: Adding Test Code (Golden Master Testing)

Before refactoring, we need tests that pin down the existing behavior. Writing detailed unit tests for 20,000 lines of undocumented code, however, is unrealistic.

Here, we adopted the Golden Master Testing approach.

// Golden Master Testing Implementation
// Instruction to Genspark:
「Execute the existing processData function with various input patterns,
and create tests that save its output as a "Golden Master" (correct data).
After refactoring, we will verify that the same input yields the same output.」

import fs from 'fs';
import crypto from 'crypto';

interface GoldenMasterTest {
  name: string;
  input: unknown;
  output: unknown;
  hash: string;
}

class GoldenMasterTester {
  private goldenMasterPath = './tests/golden-masters.json';
  private goldenMasters: GoldenMasterTest[] = [];
  
  constructor() {
    if (fs.existsSync(this.goldenMasterPath)) {
      this.goldenMasters = JSON.parse(
        fs.readFileSync(this.goldenMasterPath, 'utf-8')
      );
    }
  }
  
  // First run: Record Golden Master
  record(name: string, fn: (input: unknown) => unknown, input: unknown) {
    const output = fn(input);
    const hash = this.hashOutput(output);
    
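    // Replace any earlier recording with the same name so re-recording does not leave a stale entry
    this.goldenMasters = this.goldenMasters.filter(m => m.name !== name);
    this.goldenMasters.push({ name, input, output, hash });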
    fs.writeFileSync(
      this.goldenMasterPath,
      JSON.stringify(this.goldenMasters, null, 2)
    );
    
    console.log(`✓ Golden Master recorded: ${name}`);
  }
  
  // After refactoring: Verify output
  verify(name: string, fn: (input: unknown) => unknown) {
    const master = this.goldenMasters.find(m => m.name === name);
    if (!master) {
      throw new Error(`Golden Master not found: ${name}`);
    }
    
    const output = fn(master.input);
    const hash = this.hashOutput(output);
    
    if (hash !== master.hash) {
      console.error('❌ Output changed!');
      console.error('Expected:', master.output);
      console.error('Actual:', output);
      throw new Error(`Golden Master verification failed: ${name}`);
    }
    
    console.log(`✓ Golden Master verified: ${name}`);
  }
  
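  private hashOutput(output: unknown): string {
    // Sort object keys recursively so that key order does not affect the hash.
    // (Passing an array as the replacer would act as a key allowlist and
    // silently drop nested fields such as id and name.)
    const json = JSON.stringify(output, (_key, value) =>
      value !== null && typeof value === 'object' && !Array.isArray(value)
        ? Object.fromEntries(
            Object.entries(value).sort(([a], [b]) => a.localeCompare(b))
          )
        : value
    );
    // JSON.stringify returns undefined for undefined input, so fall back to a fixed string
    return crypto.createHash('sha256').update(json ?? 'undefined').digest('hex');
  }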
}

// Example usage: Recording Golden Master
const tester = new GoldenMasterTester();

// Record various patterns before refactoring
tester.record('processData-empty', processData, []);
tester.record('processData-single', processData, [{ id: 1, name: 'Test' }]);
tester.record('processData-multiple', processData, [
  { id: 1, name: 'Test1' },
  { id: 2, name: 'Test2' },
]);
tester.record('processData-special-chars', processData, [
  { id: 1, name: 'Test<>"' },
]);
tester.record('processData-large-dataset', processData, generateLargeDataset(1000));

// Verify after refactoring
tester.verify('processData-empty', processDataRefactored);
tester.verify('processData-single', processDataRefactored);
// ...and so on, verify all patterns
            

💡 Benefits of Golden Master Testing

Rapid adoption: Can test without understanding the internal structure of existing code
Comprehensive coverage: Verifies the overall behavior of functions
Refactoring safety net: Confirms that the output is unchanged for every recorded input
Regression detection: Immediately discovers unintended behavior changes
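To make the regression-detection benefit concrete, here is a small in-memory sketch of the same record-then-verify idea. It compares canonical JSON strings instead of hashes and keeps everything in a Map rather than a file. All names (`canonicalJson`, `recordMasters`, `findRegressions`, `legacySummary`, `refactoredSummary`) are hypothetical, and the refactored function contains a deliberate bug so there is something to catch.

```typescript
// Serialize with object keys sorted recursively so that key order never
// affects the comparison. Arrays keep their order.
function canonicalJson(value: unknown): string {
  return JSON.stringify(value, (_key, v) =>
    v !== null && typeof v === 'object' && !Array.isArray(v)
      ? Object.fromEntries(
          Object.entries(v).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
        )
      : v
  );
}

// Run fn over every named input and keep the canonical output as the master.
function recordMasters<I>(
  fn: (input: I) => unknown,
  inputs: Record<string, I>
): Map<string, string> {
  const masters = new Map<string, string>();
  for (const [name, input] of Object.entries(inputs)) {
    masters.set(name, canonicalJson(fn(input)));
  }
  return masters;
}

// Return the names of all cases whose output differs from the recorded master.
function findRegressions<I>(
  fn: (input: I) => unknown,
  inputs: Record<string, I>,
  masters: Map<string, string>
): string[] {
  return Object.entries(inputs)
    .filter(([name, input]) => canonicalJson(fn(input)) !== masters.get(name))
    .map(([name]) => name);
}

interface LineItem { price: number; qty: number }

function legacySummary(items: LineItem[]) {
  let total = 0;
  for (let i = 0; i < items.length; i++) total += items[i].price * items[i].qty;
  return { count: items.length, total };
}

// Deliberately buggy rewrite: qty is ignored. Key order also differs on purpose.
function refactoredSummary(items: LineItem[]) {
  return { total: items.reduce((sum, item) => sum + item.price, 0), count: items.length };
}

const summaryCases: Record<string, LineItem[]> = {
  empty: [],
  single: [{ price: 10, qty: 1 }],
  multiple: [{ price: 10, qty: 2 }, { price: 5, qty: 1 }],
};

const summaryMasters = recordMasters(legacySummary, summaryCases);
console.log(findRegressions(refactoredSummary, summaryCases, summaryMasters)); // [ 'multiple' ]
```

Only the `multiple` case fails, which also shows the limit of the technique: a regression is caught only if some recorded input exercises it, so the choice of input patterns matters more than their number.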

Phase 3: Accumulation of Small Improvements (Strangler Fig Pattern)

Refactoring everything at once is dangerous. We adopted the Strangler Fig Pattern to gradually replace code with new implementations.

// Step 1: Create a new function that wraps the existing function
// Instruction to Genspark:
「Create a new processDataV2 function that wraps the processData function.
Internally, it should initially call processData, and gradually replace it with a new implementation.」

// Existing code (untouched)
function processData(data) {
  // ...412 lines of complex logic
}

// New wrapper function
function processDataV2(data) {
  // Gradually switch with feature flags
  if (featureFlags.useNewProcessor) {
    // New implementation (gradually expanding)
    const validated = validateDataV2(data);
    if (validated) {
      return transformDataV2(validated);
    }
    // Fallback: Use the old implementation if the new one cannot process
    console.warn('Falling back to legacy processData');
  }
  
  // Call existing implementation
  return processData(data);
}

// Step 2: Partially replace with new implementation
function processDataV2(data) {
  // Data validation part migrated to new implementation
  const validated = validateDataV2(data);
  
  if (featureFlags.useNewTransformer) {
    return transformDataV2(validated);
  }
  
  // Transformation logic still uses the old implementation
  return processData(validated);
}

// Step 3: Fully migrate to new implementation
function processDataV2(data) {
  const validated = validateDataV2(data);
  const transformed = transformDataV2(validated);
  return enrichDataV2(transformed);
}

// Step 4: Delete old implementation and restore function name
// Rename processDataV2 → processData
// Delete old processData
            

🎉 Effects of the Strangler Fig Pattern

What was achieved with this approach:

Zero downtime: Refactoring while continuously operating in production
Phased releases: Verification at each stage with Golden Master Tests
Instant rollback: If an issue arises, revert to the old implementation immediately with feature flags
Team-wide peace of mind: Freed from the fear of "it might break"

Phase 4: Dependency Organization and Modularization

We reduced global variables and implemented appropriate Dependency Injection (DI).

// ❌ Before refactoring: Dependent on global variables
var globalState = {
  userData: null,
  config: {},
  cache: {},
};

function fetchUserData(userId) {
  // Directly references global variables
  if (globalState.cache[userId]) {
    return globalState.cache[userId];
  }

  // Note: fetch() returns a Promise, so a pending Promise is cached here, not user data
  const data = fetch(`/api/users/${userId}`);
  globalState.cache[userId] = data;
  globalState.userData = data;
  return data;
}

function updateUI() {
  // Directly references global variables
  const user = globalState.userData;
  document.getElementById('username').textContent = user.name;
}

// ✅ After refactoring: Dependency Injection
// Instruction to Genspark:
「Eliminate dependencies on global variables and implement using the dependency injection pattern.
Each class should explicitly receive necessary dependencies in its constructor.」

interface UserRepository {
  getUser(userId: string): Promise<User>;
  saveUser(user: User): Promise<void>;
}

interface CacheService {
  get<T>(key: string): T | null;
  set<T>(key: string, value: T): void;
}

// Repository implementation
class UserRepositoryImpl implements UserRepository {
  constructor(
    private apiClient: ApiClient,
    private cache: CacheService
  ) {}

  async getUser(userId: string): Promise<User> {
    // Cache check
    const cached = this.cache.get<User>(`user:${userId}`);
    if (cached) {
      return cached;
    }

    // API call
    const user = await this.apiClient.get<User>(`/users/${userId}`);

    // Save to cache
    this.cache.set(`user:${userId}`, user);

    return user;
  }

  async saveUser(user: User): Promise<void> {
    await this.apiClient.put(`/users/${user.id}`, user);

    // Update cache
    this.cache.set(`user:${user.id}`, user);
  }
}

// UI update class
class UserProfileUI {
  constructor(
    private userRepository: UserRepository,
    private container: HTMLElement
  ) {}

  async render(userId: string): Promise<void> {
    const user = await this.userRepository.getUser(userId);

    // Values are interpolated unescaped; escape them if they can contain user input
    this.container.innerHTML = `
      <div>
        <h2>${user.name}</h2>
        <p>${user.email}</p>
      </div>
    `;
  }
}

// Assembling dependencies (Dependency Injection Container)
class DIContainer {
  private cache: CacheService;
  private apiClient: ApiClient;
  private userRepository: UserRepository;

  constructor() {
    // Create singleton instances
    this.cache = new InMemoryCacheService();
    this.apiClient = new ApiClient('/api');
    this.userRepository = new UserRepositoryImpl(this.apiClient, this.cache);
  }

  getUserRepository(): UserRepository {
    return this.userRepository;
  }

  createUserProfileUI(container: HTMLElement): UserProfileUI {
    return new UserProfileUI(this.userRepository, container);
  }
}

// Example usage
const container = new DIContainer();
const userProfileUI = container.createUserProfileUI(
  document.getElementById('profile-container')! // assumes the element exists
);
userProfileUI.render('user-123');

🔧 Benefits of Dependency Organization

Testability: Easy unit testing by injecting mocks/stubs
Reusability: The same logic can be reused in different environments
Maintainability: Explicit dependencies clarify the scope of change
Flexibility: Easy to swap implementations (e.g., in-memory cache → Redis cache)
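The testability benefit is easiest to see with a stub in place of the real API client. The sketch below redeclares a trimmed-down repository so it is self-contained. The `ApiClient` shape, `StubApiClient`, `MapCache`, and `cacheIsUsedOnSecondRead` are assumptions made for illustration, not the project's actual classes.

```typescript
interface User { id: string; name: string; email: string }

// Assumed minimal shapes for the API client and cache dependencies.
interface ApiClient { get<T>(path: string): Promise<T> }
interface CacheService {
  get<T>(key: string): T | null;
  set<T>(key: string, value: T): void;
}

class UserRepositoryImpl {
  private readonly apiClient: ApiClient;
  private readonly cache: CacheService;

  constructor(apiClient: ApiClient, cache: CacheService) {
    this.apiClient = apiClient;
    this.cache = cache;
  }

  async getUser(userId: string): Promise<User> {
    const cached = this.cache.get<User>(`user:${userId}`);
    if (cached) return cached;
    const user = await this.apiClient.get<User>(`/users/${userId}`);
    this.cache.set(`user:${userId}`, user);
    return user;
  }
}

// Stub API client: returns canned data and records every requested path.
class StubApiClient implements ApiClient {
  readonly requestedPaths: string[] = [];
  private readonly users: Record<string, User>;

  constructor(users: Record<string, User>) {
    this.users = users;
  }

  async get<T>(path: string): Promise<T> {
    this.requestedPaths.push(path);
    return this.users[path.replace('/users/', '')] as unknown as T;
  }
}

// Simple in-memory cache backed by a Map.
class MapCache implements CacheService {
  private readonly store = new Map<string, unknown>();
  get<T>(key: string): T | null {
    return (this.store.get(key) as T | undefined) ?? null;
  }
  set<T>(key: string, value: T): void {
    this.store.set(key, value);
  }
}

// Test-style check: the second read must be served from the cache, not the API.
async function cacheIsUsedOnSecondRead(): Promise<boolean> {
  const api = new StubApiClient({
    'user-123': { id: 'user-123', name: 'Jane', email: 'jane@example.com' },
  });
  const repository = new UserRepositoryImpl(api, new MapCache());
  const first = await repository.getUser('user-123');
  const second = await repository.getUser('user-123');
  return first.name === 'Jane' && second === first && api.requestedPaths.length === 1;
}
```

No network, DOM, or global state is involved, which is exactly what the global-variable version made impossible. The same check could be dropped into a test runner such as Jest with an `expect` in place of the boolean return.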

Phase 5: TypeScript Conversion and Type Safety Improvement

As a final step, we carried out the migration from JavaScript to TypeScript.

// Instruction to Genspark:
「Convert the following JavaScript code to TypeScript.
Use strict type definitions and avoid using the any type.
Utilize utility types (Partial, Readonly, Pick, etc.) as needed.」

// JavaScript (no type information)
function mergeUserData(existing, updates) {
  return {
    ...existing,
    ...updates,
    updatedAt: new Date(),
  };
}

// TypeScript (type-safe)
interface User {
  id: string;
  name: string;
  email: string;
  role: 'admin' | 'user' | 'guest';
  createdAt: Date;
  updatedAt: Date;
}

// Updatable fields only; `existing` is kept immutable with Readonly
type UserUpdate = Partial<Omit<User, 'id' | 'createdAt' | 'updatedAt'>>;

function mergeUserData(
  existing: Readonly<User>,
  updates: UserUpdate
): User {
  return {
    ...existing,
    ...updates,
    updatedAt: new Date(),
  };
}

// Detect type errors during compilation
const user: User = {
  id: '123',
  name: 'John',
  email: 'john@example.com',
  role: 'user',
  createdAt: new Date('2023-01-01'),
  updatedAt: new Date('2023-01-01'),
};

// ✅ OK: Only allowed fields are updated
const updated = mergeUserData(user, { name: 'Jane' });

// ❌ Compile error: id cannot be updated
// const invalid = mergeUserData(user, { id: '456' });

// ❌ Compile error: Non-existent field
// const invalid2 = mergeUserData(user, { age: 30 });

// ❌ Compile error: role only accepts limited values
// const invalid3 = mergeUserData(user, { role: 'superuser' });

✨ Measuring the Effects of TypeScript Migration

Results three months after the refactoring was completed:

Bug reduction: Significant decrease in bug occurrence rate in production
Increased development speed: Significantly shortened lead time for new feature development
Code lines: Reduced from 20,000 lines to 12,000 lines (duplicate removal)
Test coverage: Significantly improved from 0%
Team confidence: Zero instances of "I'm scared to change the code"

🤖 Effective Ways to Utilize Genspark

Refactoring Task Instruction Templates

// Pattern 1: Function Splitting
「The following function is too complex. According to the Single Responsibility Principle,
please split it into functions of appropriate size. Each function must meet the following conditions:
- Max 20 lines per function
- Cyclomatic complexity <= 10
- Have only one clear responsibility
- Give meaningful function names」

// Pattern 2: Integrating Duplicate Code
「Detect duplicate logic from the following set of files and
extract it as a common function. When extracting:
- Parameterize generically
- Set appropriate default arguments
- Include usage examples in JSDoc comments」

// Pattern 3: Simplifying Conditional Branches
「Rewrite the following nested conditional branches into a flatter structure
using early return patterns and guard clauses. Prioritize readability.」

// Pattern 4: Adding Error Handling
「Add comprehensive error handling to the following function:
- Input validation
- try-catch blocks
- Appropriate error messages
- Log output
- Retry functionality (if necessary)」

// Pattern 5: Improving Asynchronous Processing
「Rewrite the following callback hell with async/await.
Optimize concurrent operations with Promise.all.」
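As an illustration of what Pattern 3 asks for, here is a before-and-after sketch. The `Order` type and the shipping rules are hypothetical. The two functions are meant to behave identically, which is the property a Golden Master test would verify.

```typescript
interface Order { items: string[]; total: number; isPremium: boolean }

// Before: nested conditionals push the main rule four levels deep.
function shippingFeeNested(order: Order | null): number {
  if (order) {
    if (order.items.length > 0) {
      if (!order.isPremium) {
        if (order.total < 50) {
          return 5;
        } else {
          return 0;
        }
      } else {
        return 0;
      }
    } else {
      throw new Error('Order has no items');
    }
  } else {
    throw new Error('Order is required');
  }
}

// After: guard clauses reject invalid input first, then early returns
// handle the special case, leaving the main rule on the last line.
function shippingFee(order: Order | null): number {
  if (!order) throw new Error('Order is required');
  if (order.items.length === 0) throw new Error('Order has no items');
  if (order.isPremium) return 0;
  return order.total < 50 ? 5 : 0;
}
```

Stating the expected shape in the prompt (guard clauses first, one rule per line) tends to make the generated result easier to review against the original.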
            

💡 Tips for Pair Refactoring with Genspark

Start small: Begin with one function at a time, gradually expanding the scope
Thorough verification: Run tests at each step to confirm functionality
Clarify intent: Communicate "why" refactoring is being done
Specify constraints: Explicitly state performance requirements and compatibility requirements
Iterative review: Always have a human review the generated code

📊 Refactoring Progress Management

Visualize Code Quality with Metrics

We asked Genspark to "create a script to measure code quality metrics" and quantitatively tracked our progress.

// code-metrics.ts
import { readFileSync } from 'fs';
import { glob } from 'glob';

interface CodeMetrics {
  totalLines: number;
  codeLines: number;
  commentLines: number;
  functionCount: number;
  averageFunctionLength: number;
  maxFunctionLength: number;
  complexFunctions: number; // complexity > 10
  testCoverage: number;
}

async function calculateMetrics(pattern: string): Promise<CodeMetrics> {
  const files = await glob(pattern);
  let totalLines = 0;
  let codeLines = 0;
  let commentLines = 0;
  let functionCount = 0;
  let functionLengths: number[] = [];
  
  for (const file of files) {
    const content = readFileSync(file, 'utf-8');
    const lines = content.split('\n');
    
    totalLines += lines.length;
    
    for (const line of lines) {
      const trimmed = line.trim();
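      // Heuristic: a leading '*' catches the inner lines of /* ... */ block comments
      if (trimmed.startsWith('//') || trimmed.startsWith('/*') || trimmed.startsWith('*')) {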
        commentLines++;
      } else if (trimmed.length > 0) {
        codeLines++;
      }
    }
    
    // Function detection and line count
    const functionRegex = /function\s+\w+|const\s+\w+\s*=\s*\(/g;
    const functions = content.match(functionRegex) || [];
    functionCount += functions.length;
    
    // ...Calculate function length
  }
  
  return {
    totalLines,
    codeLines,
    commentLines,
    functionCount,
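    // Guard against empty input: avoids NaN from 0 / 0 and -Infinity from Math.max()
    averageFunctionLength: functionLengths.length
      ? functionLengths.reduce((a, b) => a + b, 0) / functionLengths.length
      : 0,
    maxFunctionLength: functionLengths.length ? Math.max(...functionLengths) : 0,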
    complexFunctions: 0, // To be implemented
    testCoverage: 0, // Get from jest coverage
  };
}

// Record metrics weekly
async function trackProgress() {
  const metrics = await calculateMetrics('src/**/*.ts');
  console.log('=== Code Quality Metrics ===');
  console.log(`Total Lines: ${metrics.totalLines}`);
  console.log(`Code Lines: ${metrics.codeLines}`);
  console.log(`Comment Rate: ${(metrics.commentLines / metrics.totalLines * 100).toFixed(1)}%`);
  console.log(`Function Count: ${metrics.functionCount}`);
  console.log(`Avg Function Length: ${metrics.averageFunctionLength.toFixed(1)} lines`);
  console.log(`Max Function Length: ${metrics.maxFunctionLength} lines`);
  console.log(`Test Coverage: ${metrics.testCoverage}%`);
}
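The function-length step is left open in the script above. One possible way to fill it, shown here as a naive sketch rather than the script's actual implementation, is to count lines from each `function` declaration to its matching closing brace. `measureFunctionLengths` and `FunctionSpan` are hypothetical names. Braces inside strings, comments, or template literals will throw the count off, and arrow functions are not detected.

```typescript
interface FunctionSpan {
  name: string;
  startLine: number; // 1-based line of the declaration
  length: number;    // lines from the declaration to the closing brace, inclusive
}

// Naive brace matching: start at each `function name(` line and scan forward
// until the braces opened since that line are balanced again.
function measureFunctionLengths(source: string): FunctionSpan[] {
  const lines = source.split('\n');
  const spans: FunctionSpan[] = [];
  const declaration = /function\s+(\w+)\s*\(/;

  lines.forEach((line, index) => {
    const match = declaration.exec(line);
    if (!match) return;

    let depth = 0;
    let opened = false;
    for (let i = index; i < lines.length; i++) {
      for (const ch of lines[i]) {
        if (ch === '{') {
          depth++;
          opened = true;
        } else if (ch === '}') {
          depth--;
        }
      }
      if (opened && depth === 0) {
        spans.push({ name: match[1], startLine: index + 1, length: i - index + 1 });
        return;
      }
    }
  });

  return spans;
}

const lengthSample = [
  'function a() {',
  '  return 1;',
  '}',
  '',
  'function b(x) {',
  '  if (x) {',
  '    return 2;',
  '  }',
  '  return 3;',
  '}',
].join('\n');

console.log(measureFunctionLengths(lengthSample).map((span) => span.length)); // [ 3, 6 ]
```

For production tracking, a parser-based approach such as the TypeScript compiler API would be more robust. The brace counter is enough to watch the longest functions shrink week over week.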
            

💫 Summary: Refactoring is a Continuous Improvement Process

Refactoring legacy code is not a one-time event but a continuous improvement process.

Through collaboration with Genspark, the development team proceeded with refactoring with the attitude of "not aiming for perfection, but making the code better than yesterday." As a result, the entire team became positive about improving code quality, and a culture emerged where they tackled technical debt without fear.

The 5-phase approach introduced in this article (current state assessment → add tests → gradual improvement → modularization → improve type safety) is a general strategy that can be adapted to most legacy projects.

What you can do starting today: First, choose one of your most frequently changed files and ask Genspark to "Analyze the problems in this file." Those analysis results will be the first step in your refactoring journey.