Will GPT-engineer Speed Up Your Refactoring? An In-Depth Guide

May 7, 2024 • 26 min read

Will the GPT-engineer speed up your refactoring? Our investigation revealed it didn't quite excel in this domain. Delve into the details to understand why.

Improved efficiency, better software performance, and lower maintenance costs are just some of the many benefits of refactoring. Yet the process itself is tedious and time-consuming, which makes it a tempting candidate for handing over to AI-powered tools like GPT-engineer.

The real question is: can implementing the GPT-engineer help you streamline the process?

This guide will help you find the answer.

Key takeaways:

Our team tested GPT-engineer for its potential to speed up refactoring.
We evaluated its performance in common tasks such as renaming variables, methods, and classes, and extracting methods, interfaces, and superclasses.
GPT-engineer handles writing new code much better than it does refactoring and understanding existing code.

What is code refactoring?

Code refactoring is the process of improving existing code without changing its external behavior.

By restructuring the code, you can refine the design, structure, or implementation of the software while preserving its original functionality. This technique helps improve the nonfunctional attributes of the software, such as its readability, maintainability, and complexity.
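To make the definition concrete, here is a minimal sketch of a behavior-preserving change. It is not taken from the application studied later in this guide; the Product shape and all function names are assumptions chosen purely for illustration.

```typescript
// Hypothetical product shape, used only for this illustration.
interface Product {
  name: string;
  quantity: number;
}

// Before: terse names, with the stock check buried inside the loop.
function calc(p: Product[]): number {
  let t = 0;
  for (const x of p) {
    if (x.quantity > 0) {
      t = t + x.quantity;
    }
  }
  return t;
}

// After: the condition is extracted and the names describe intent.
function isInStock(product: Product): boolean {
  return product.quantity > 0;
}

function totalQuantityInStock(products: Product[]): number {
  return products
    .filter(isInStock)
    .reduce((sum, product) => sum + product.quantity, 0);
}

const sampleProducts: Product[] = [
  { name: "Pen", quantity: 2 },
  { name: "Ink", quantity: 0 },
  { name: "Pad", quantity: 3 },
];

// Both versions return the same total, so external behavior is unchanged.
console.log(calc(sampleProducts), totalQuantityInStock(sampleProducts));
```

Only the structure and naming differ between the two versions. A caller that swaps calc for totalQuantityInStock should observe no difference, which is exactly what a test suite is meant to confirm after a refactoring.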

Why you should consider refactoring your code

Now, if the functionality stays the same, you may wonder: why should I refactor the code? There are many advantages to refactoring, including:

Better work efficiency: Clearer code equals reduced time and effort needed for onboarding new developers, as they can quickly understand the entire codebase.
Lower maintenance costs: Clearer code makes bugs and other issues easier to identify and resolve.
Increased productivity: With less time spent on deciphering the code, developers can also have more time for implementing new features.
Enhanced agility: Easily readable code allows for quicker modifications and iterations, which is crucial in responding to changes in business requirements.
Improved quality of the product: Refactoring the code can also enhance the software's performance, leading to higher customer satisfaction.

Most common refactoring activities

The process of refactoring can consist of several different actions. Below, we've listed some of the most frequent activities:

Renaming a variable: Giving a new name to an existing code variable
Renaming a method: Changing the name of a function or a procedure in the code
Renaming a class: Altering the name of a class in the code
Moving a method: Transferring a function to a different location within the codebase
Moving a class: Relocating a class to a different part of the codebase
Extracting a method: Creating a new function from a part of an existing one
Inlining a method: Incorporating the functionality of a function or a procedure directly into its calling code
Extracting a class: Creating a new class from a part of an existing one
Extracting an interface: Defining a new interface based on common behaviors or methods of existing classes
Extracting a superclass: Creating a new parent class to hold common attributes or methods shared by multiple existing classes
Pulling up a method: Moving a method from a subclass to its superclass to promote code reusability
Pushing down a method: Moving a method from a superclass to one or more of its subclasses to better align functionality with subclass responsibilities

What is the GPT-engineer?

In short, the GPT-engineer is an AI-powered app builder that translates a project description into a ready-to-use codebase. The solution is based on GPT models and can convert natural language into code, execute it, and implement improvements in existing projects.

The premise of the GPT-engineer platform is that users no longer have to write code from scratch. Instead, programmers can describe the project, and in return, the GPT-engineer will generate the entire codebase.

Yet, the GPT-engineer is not just a code generation tool. It also acts as an AI coding assistant for software developers, helping them with their daily tasks.

How does the GPT-engineer work

The GPT-engineer is a tool that leverages artificial intelligence and machine learning to assist you with coding. Its workflow is simple and consists of three steps:

1. Defining the prompt

The first step in using the GPT-engineer is creating a prompt, which includes any specifications or requirements related to your project, allowing the AI tool to generate code.

2. Generating code and/or codebase

After that, the GPT-engineer analyzes your request and generates code snippets, functions, or the entire codebase based on the prompted tasks. In some cases, you may also need to answer follow-up questions from the tool.

3. Improving the generated code

Now, software developers need to adapt the new code. Although the result serves as a solid starting point, finalizing the code to meet all requirements remains a human task.

Installing the GPT-engineer: Step-by-step

For stable release

Step 1: Install gpt-engineer

To do this, run: python -m pip install gpt-engineer

Step 2: Set API Key

Enter your API key. There are several ways to do this, all of which are explained in the official documentation.

Step 3: Run the GPT-engineer

Now, you're ready to run the program. Create a file called prompt in your project directory and write your instructions in it. Then run gpte <project_dir> to generate new code, or gpte <project_dir> -i to improve existing code.

For development

Step 1: Clone the GPT-engineer GitHub repository

Clone the GPT-engineer GitHub repository.

Step 2: Set API Key

Enter your API key. There are several ways to do this, all of which are explained in the official documentation.

Step 3: Set up the GPT-engineer

Following that, you'll need to navigate to the cloned directory using the 'cd' command. Then, install all necessary dependencies and activate the virtual environment with the following commands:

poetry install
poetry shell

Step 4: Run the GPT-engineer

Now, you're ready to run the program. Create a file called prompt in your project directory and write your instructions in it. Then run gpte <project_dir> to generate new code, or gpte <project_dir> -i to improve existing code.

Using the GPT-engineer for refactoring: A case study

Having learned the benefits of the tool, we decided to test the GPT-engineer's capabilities ourselves to determine whether the solution is as good as it promises to be.

The results? They were quite surprising, but let's start from the beginning.

The GPT-engineer in refactoring: Chosen methodology of study

To measure the effectiveness of the GPT-engineer in the case of refactoring, we decided to perform the most common refactoring activities manually (by a person) and with the help of the tool.

We ensured that the code subject to refactoring had 100% test coverage, so that we could confirm the changes did not disrupt the application logic.

The GPT-engineer in refactoring: Tested use cases

Our team set out to test the GPT-engineer in the following use cases: renaming a variable, renaming a method, renaming a class, extracting a method, extracting an interface, and extracting a superclass.

The GPT-engineer in refactoring: Application under study

Project description: A simple CRUD application for products, where each product has two properties – a name and a quantity

Technologies in use: Node.js, TypeScript, Express, and Joi

Testing: The app had 100% test coverage, achieved exclusively through integration tests (so that we would know whether a refactoring changed the business logic). All tests were written using Jest and Supertest.

Test results:

 PASS  src/controllers/productsController.test.ts (5.197 s)
  Products API
    POST /api/products
      ✓ should create a new product (204 ms)
      ✓ should return 400 if product parameters are invalid (8 ms)
    GET /api/products
      ✓ should list all products (9 ms)
      ✓ should return 400 if pagination parameters are invalid (6 ms)
    GET /api/products/:id
      ✓ should get a product by id (7 ms)
      ✓ should return 404 if product not found (6 ms)
      ✓ should return 400 if product id is invalid (6 ms)
    PUT /api/products/:id
      ✓ should update a product by id (8 ms)
      ✓ should return 404 if product not found (7 ms)
      ✓ should return 400 if product parameters are invalid (6 ms)
      ✓ should return 400 if product id is invalid (6 ms)
    DELETE /api/products/:id
      ✓ should delete a product by id (6 ms)
      ✓ should return 404 if product not found (4 ms)
      ✓ should return 400 if product id is invalid (15 ms)

------------------------|---------|----------|---------|---------|-------------------
File                    | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s 
------------------------|---------|----------|---------|---------|-------------------
All files               |     100 |      100 |     100 |     100 |                   
 src                    |     100 |      100 |     100 |     100 |                   
  server.ts             |     100 |      100 |     100 |     100 |                   
 src/controllers        |     100 |      100 |     100 |     100 |                   
  productsController.ts |     100 |      100 |     100 |     100 |                   
 src/entity             |     100 |      100 |     100 |     100 |                   
  ProductEntity.ts      |     100 |      100 |     100 |     100 |                   
 src/errors             |     100 |      100 |     100 |     100 |                   
  ApplicationError.ts   |     100 |      100 |     100 |     100 |                   
 src/repositories       |     100 |      100 |     100 |     100 |                   
  productsRepository.ts |     100 |      100 |     100 |     100 |                   
 src/routes             |     100 |      100 |     100 |     100 |                   
  productsRouter.ts     |     100 |      100 |     100 |     100 |                   
 src/services           |     100 |      100 |     100 |     100 |                   
  productsService.ts    |     100 |      100 |     100 |     100 |                   
 src/validators         |     100 |      100 |     100 |     100 |                   
  productsValidator.ts  |     100 |      100 |     100 |     100 |                   
------------------------|---------|----------|---------|---------|-------------------
Test Suites: 1 passed, 1 total
Tests:       14 passed, 14 total
Snapshots:   0 total
Time:        5.684 s, estimated 7 s

Application architecture: A standard layered structure, presented in the diagram below

[Diagram: Application architecture]

Please note that the repository is small and consists of only 350 lines of code, including an integration test file that is 114 lines long.

Endpoints:

HTTP Method	Path	Description
POST	/api/products	Creates a product
GET	/api/products	Returns an array of products; supports pagination
GET	/api/products/:id	Returns a specific product by ID; responds with a 404 error if the given ID does not exist
PUT	/api/products/:id	Updates the whole product by ID; responds with a 404 error if the given ID does not exist
DELETE	/api/products/:id	Removes a product by ID; responds with a 404 error if the given ID does not exist

The GPT-engineer in refactoring: Examining the results

So, how did the GPT-engineer tool perform? Unfortunately, in most cases, the proposed changes failed to pass the tests.

Renaming operations

Renaming variables

We started with renaming variables. Since current IDEs can already rename a single variable across many files automatically, we decided to try something a bit harder.

We imagined a scenario in which the application no longer stores products in general, but only currently available products. As a result, we wanted to change every product reference in the entire codebase to availableProduct.

Here's the prompt that we used and the results of the test.

Parameters	Outcomes
Prompt	Rename each product variable to availableProduct. For example, when a product is returned from a repository, use availableProduct rather than product. Do this in all scenarios in all files.
Result	Tests don’t pass after changes.
Cost	It took 5 minutes and cost $2.80 to execute the prompt and make changes through GPT-engineer.

Our first attempt ended in failure. Either the prompt was not effective enough or the task was too challenging for the current version of the tool.

In most cases, it didn’t update variable names, and when it did, the code ended up with multiple syntax errors. Given how small the tested codebase was, the execution also seemed slow and expensive.

Below, you’ll find a few output examples.

In this part of the code, GPT-engineer renamed the variable from product to availableProduct in line six but failed to do so in line number two, resulting in an invalid reference.

createProduct = async (req: Request, res: Response) => {
  const product = await this.productsService.createProduct(req.body);
    
  res.status(201).json({
    httpStatusCode: 201,
    data: availableProduct,
    message: "Product created successfully",
  });
};

Here, the function declaration has been duplicated, which is a very common occurrence. Often, we needed to fix the missing curly brackets, which can be quite time-consuming.

async update(product: ProductEntity): Promise<availableProduct | null> {
async update(product: ProductEntity): Promise<ProductEntity | null> {
  const index = this.products.findIndex(
  (product) => product.id === product.id
    );
    
  // ...
}

In this rare instance, the modified code was correct. However, the approach was unusual: rather than renaming the variable in line two, the tool assigned it to a new variable in line three.

getProductById = async (req: Request, res: Response) => {
  const product = await this.productsService.getProductById(req.params.id);
  const availableProduct = product;

  res.status(200).json({
    httpStatusCode: 200,
    data: availableProduct,
    message: "Product returned successfully",
  });
};

In summary, the AI tool renamed variables very rarely and inconsistently, breaking the code in almost every instance. Overall, less than 5% of the proposed changes were acceptable.

Renaming the method

After that, we set out to try something simpler. This time, we asked the AI tool to rename all methods from camel case to snake case.

Parameters	Outcomes
Prompt	Please change method names from camel case to snake case.
Result	Tests don’t pass after changes.
Cost	It took 4 minutes and cost $0.21 to execute the prompt and make changes through GPT-engineer.

Occasionally, the GPT-engineer appended the literal text snake_case to the method names.

async create_snake_case(product: ProductEntity): Promise<ProductEntity> {
  this.products.push(product);
  
  return product;
}

In other places, it unnecessarily renamed the constructor, as in this example.

export class ProductEntity {
  constructor_snake_case(
    public id: string,
    public name: string,
    public quantity: number
  ) {}
}

In many cases, the tool correctly renamed methods in declarations but did not update the names when calling the same methods. It ended up renaming findById to find_by_id, but still referenced findById in other files.

We tried more descriptive prompts to achieve this goal, but they didn't make much of a difference. We also noticed some other unexpected results, such as "No change" comments added throughout the entire codebase.

export class ProductEntity {
  constructor(
    public id: string, // No change
    public name: string, // No change
    public quantity: number // No change
  ) {}
}

All in all, using the GPT-engineer for this action was not very helpful as its proposals often introduced obvious syntax errors and were very inconsistent.
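For reference, the naming rule we asked for can be written down in a few lines. The sketch below only illustrates the intended transformation; it is not how GPT-engineer works internally, and camelToSnake is a hypothetical helper name. It assumes method names start with a lowercase letter and contain no runs of capitals (an acronym such as UUID would be split letter by letter).

```typescript
// Hypothetical helper: converts a camelCase identifier to snake_case
// by prefixing every uppercase letter with an underscore and lowercasing it.
function camelToSnake(name: string): string {
  return name.replace(/[A-Z]/g, (letter) => `_${letter.toLowerCase()}`);
}

// Names without uppercase letters, such as "constructor", pass through unchanged.
const methodNames = ["findById", "createProduct", "constructor"];
console.log(methodNames.map(camelToSnake));
```

Computing the new name is the easy part. The harder part, and the one where the tool struggled, is applying it consistently to the declaration and to every call site across all files.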

Renaming the class

We moved on to another test. We wanted every class in our code to end with the word "Class", and here's how it went.

Parameters	Outcomes
Prompt	Please add "Class" word at the end of each class. For instance ProductsController rename to ProductsControllerClass, ProductEntity to ProductEntityClass. Do it with each class name.
Result	Tests don’t pass after changes.
Cost	It took 55 seconds and cost $0.16 to execute the prompt and make changes through GPT-engineer.

The execution of the prompt was fast and inexpensive, but the result was not satisfactory. The mistakes were similar to those in the previous examples: a lack of consistency and invalid syntax.

Extracting a method

This time, we attempted to use the GPT-engineer to refactor the code by extracting the UUID generation into a separate method.

We aimed to transform the following:

async createProduct(input: {
  name: string;
  quantity: number;
}): Promise<ProductEntity> {
  const product = new ProductEntity(randomUUID(), input.name, input.quantity);

  return this.productsRepository.create(product);
}

Into this:

generateIdentifier(): string {
  return randomUUID();
}

async createProduct(input: {
  name: string;
  quantity: number;
}): Promise<ProductEntity> {
  const product = new ProductEntity(
    this.generateIdentifier(),
    input.name,
    input.quantity
  );

  return this.productsRepository.create(product);
}

What were the outcomes?

Parameters	Outcomes
Prompt	Please extract from the create method in productService generation of randomUUID. Extract this to a method called generateIdentifier and use it in the create method. generateIdentifier should not receive any parameters and should return the uuid generated by randomUUID function.
Result	Tests don’t pass after changes.
Cost	It took 5 seconds and cost $0.04 to execute the prompt and make changes through GPT-engineer.

The result was close but not functional. The tool successfully extracted the method, but it also left an unnecessary call to the repository after the return statement in generateIdentifier, which broke the code.

async createProduct(input: {
  name: string;
  quantity: number;
}): Promise<ProductEntity> {
  const identifier = this.generateIdentifier();
  const product = new ProductEntity(identifier, input.name, input.quantity);
  return this.productsRepository.create(product);
}
private generateIdentifier(): string {
  return randomUUID();
  return this.productsRepository.create(product);
}

Changing the code manually in one place is simpler than typing a prompt and correcting the result later. Our team had GitHub Copilot integrated with their IDE, and, in this case, writing generateIdentifier() mostly came down to accepting the suggested lines by pressing the Tab key a few times.

Extracting the interface

Moving on, we asked the AI tool to extract the interface from ProductEntity.ts. Our goal was to create an interface like this:

export interface ProductInterface {
  id: string;
  name: string;
  quantity: number;
}

Then, we wanted the ProductEntity class to implement the newly created interface.

export class ProductEntity implements ProductInterface {
  constructor(
    public id: string,
    public name: string,
    public quantity: number
  ) {}
}

Surprisingly, we had our first success.

Parameters	Outcomes
Prompt	Create interface for ProductEntity class. Make ProductEntity implement it. Put interface next to ProductEntity.ts file and name it ProductInterface.ts.
Result	Tests pass after changes.
Cost	It took 10 seconds and cost $0.04 to execute the prompt and make changes through GPT-engineer.

The GPT-engineer AI tool correctly created the interface.

However, it did not implement it in ProductEntity. The only change it made to ProductEntity.ts was removing the last line of the file.

export interface IProduct {
  id: string;
  name: string;
  quantity: number;
}

// This interface defines the structure of a product entity.

The overall outcome was close, but making this change by hand would be much simpler and faster.

Extracting the superclass

Lastly, we wanted to check how the AI tool would handle extracting the common functionality of ProductEntity and PersonEntity into another class and then sharing it through inheritance.

The common elements between those two classes were: id, name, and the returnIdAndName method. What we expected was that these would be moved to the superclass.

export class ProductEntity {
  constructor(
    public id: string,
    public name: string,
    public quantity: number
  ) {}

  returnIdAndName(): string {
    return `ID: ${this.id}, Name: ${this.name}`;
  }
}

export class PersonEntity {
  constructor(
    public id: string,
    public name: string,
  ) {}

  returnIdAndName(): string {
    return `ID: ${this.id}, Name: ${this.name}`;
  }
}

Here's what we ended up with.

Parameters	Outcomes
Prompt	Extract common features of PersonEntity.ts class and ProductEntity class to another class. Name this class RecordEntity. PersonEntity and ProductEntity should extend RecordEntity and inherit shared features. Please remove from PersonEntity and ProductEntity features extracted to RecordEntity.
Result	Tests don’t pass after changes.
Cost	It took 7 seconds and cost $0.10 to execute the prompt and make changes through GPT-engineer.

RecordEntity looked good. The AI accurately identified the duplicate functionality that needed to be extracted.

export abstract class RecordEntity {
  constructor(public id: string, public name: string) {}

  returnIdAndName(): string {
    return `ID: ${this.id}, Name: ${this.name}`;
  }
}

PersonEntity had some extra curly brackets, but, apart from that, was correct.

import { RecordEntity } from './RecordEntity';

export class PersonEntity extends RecordEntity {
  }
}

However, ProductEntity remained unchanged, which was not the expected result.

export class ProductEntity {
  constructor(
    public id: string,
    public name: string,
    public quantity: number
  ) {}

  returnIdAndName(): string {
    return `ID: ${this.id}, Name: ${this.name}`;
  }
}

The outcome of using the GPT-engineer in this scenario was partially correct but still required manual adjustments.
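For comparison, here is a sketch of roughly what we expected the three classes to look like once the extraction is finished by hand. It is one possible shape rather than output from the tool. To keep the example in a single block, the export keywords and the per-file imports are left out.

```typescript
// Superclass holding the members shared by both entities.
abstract class RecordEntity {
  constructor(public id: string, public name: string) {}

  returnIdAndName(): string {
    return `ID: ${this.id}, Name: ${this.name}`;
  }
}

// PersonEntity adds nothing of its own, so it inherits the constructor as well.
class PersonEntity extends RecordEntity {}

// ProductEntity keeps only its own property and delegates the rest to the superclass.
class ProductEntity extends RecordEntity {
  constructor(id: string, name: string, public quantity: number) {
    super(id, name);
  }
}

// Illustrative values only.
const pen = new ProductEntity("p1", "Pen", 3);
const ada = new PersonEntity("u1", "Ada");
console.log(pen.returnIdAndName(), ada.returnIdAndName());
```

Because the constructor signatures of both entities stay the same, existing callers should not need to change.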

The GPT-engineer in refactoring: Our conclusion

All things considered, the use of the GPT-engineer failed to speed up the chosen refactoring activities.

In cases where the tool had the opportunity to shine, such as when changes had to be made to multiple files across the project, it failed and created messy, inconsistent code.

When it came to small and local changes, selecting the files that needed to be changed, creating the prompt, and correcting the inaccurate result took more time than writing the code by hand or using alternatives such as GitHub Copilot.

Limitations and things to consider while implementing the GPT-engineer

If you are looking to make refactoring cheaper and faster, implementing the GPT-engineer might not be the best choice just yet. At the moment, the solution still requires a lot of manual corrections and doesn't produce satisfactory results.

The user still needs to check and adapt the code that the system generates. The tool doesn't function as an independent refactoring AI agent, and it is actually better at building a foundation for software development than at refactoring.

Nevertheless, if the tool lives up to its refactoring potential in the future, it could become a very interesting addition to a development team. For the time being, it remains a great tool for tasks such as setting up project structures or generating boilerplate code.
