Windsurf Cascade: Guide and Best Practices

Recently, I’ve been working with Windsurf Cascade integrated into IntelliJ to develop an AI-driven pilot project while testing different techniques. Cascade is an advanced AI assistant that can execute actions autonomously and implement complete solutions. It offers a deep understanding of project context, specialized tools, multitasking, and a persistent memory that stores important steps taken during task execution.

In this post, I’ll describe my experience with this tool over the last 3 months. In the next one, I’ll talk about using the interface, its rules, and the working modes (chat and code).

We’ll cover the available models, Cascade’s capabilities, working modes, and planning strategies: a complete guide to how we’ve used this tool, always with the goal of improving and embracing new ways of working.

What Is Windsurf Cascade?

Cascade is an agentic AI assistant developed by Windsurf, specifically designed for pair programming and software development. Unlike simple code autocompletion, Cascade is an autonomous agent capable of:

Understanding full context: it analyzes your project, open files, code structure, and architectural decisions.
Executing actions: it doesn’t just suggest code; it can edit files, run commands, navigate the project, and validate changes.
Reasoning and planning: it breaks down complex tasks into steps, researches solutions, and proposes strategies before implementing them.
Learning from the project: it uses a persistent memory system to remember rules, patterns, and decisions specific to your codebase.

In essence, it’s a development partner that understands both the code and your project context.

Key Features

1. Deep IDE Integration

Cascade integrates natively with IDEs like IntelliJ IDEA and VS Code, providing:

Access to open files and cursor position.
Ability to edit multiple files simultaneously.
Execution of system commands (Maven, npm, git, etc.).
Intelligent project navigation.

2. Multiple AI Models

You can switch between different models depending on the task:

Gemini Pro 2.5 for balanced speed/quality.
Claude Sonnet 4 for maximum precision.
Claude Sonnet 3.7 (Thinking) for visible reasoning.
GPT‑5 for advanced capabilities.
(New) Claude Sonnet 4.5 (Thinking) for deeper internal reasoning, better planning for complex tasks, and higher accuracy. Extremely powerful.
Etc.

3. Persistent Memory System with a Memory Bank

Stores architectural decisions.
Remembers project-specific rules.
Keeps context across conversations.
Avoids repeating explanations.

4. Productivity Tools

Semantic search across the codebase.
Plans and TO‑DO lists for complex tasks.
Safe command execution with user confirmation.
Automatic test validation.
Advanced search (grep_search) for identifying code patterns.
Integration with MCP servers (such as Cloud Run and Context7), for example to pull in up‑to‑date documentation.

5. Agentic Capabilities

Autonomous research. It looks up best practices before implementing solutions.
Dependency analysis. Understands change impact (not always perfectly).
Intelligent refactoring. Updates tests and docs automatically (not always flawlessly).
Systematic debugging. Reproduces errors, analyzes root causes, and proposes solutions. This part is impressive: it resolves issues very quickly by reading logs or adding its own debug logging.

Working Modes with Cascade

After 3 months working with this tool, I’ve identified several modes of interaction with Cascade depending on the nature of the task:

Research Mode

This is the mode I use when I don’t know the best solution:

"Research the standard pattern for handling refresh tokens in Spring Boot. Explain the options to me and recommend one before implementing."

Implementation Mode

In this case, I use it for clear and well‑defined tasks:

"Create a POST /api/orders endpoint that receives an OrderDTO, validates the data, and saves the order in the database using OrderService."

Debugging Mode

I use this mode to resolve errors:

"I’m getting a 401 error on /api/orders. Stack trace attached. Check JwtAuthenticationFilter and SecurityConfig."

Architecture Mode

For design decisions:

"I want to unify the User and AdminUser entities. Analyze the impact and propose a step-by-step migration strategy."

Learning Mode

This mode can be used to understand existing code:

"Explain to me how the refresh token system we implemented works. How does it prevent race conditions?"

Validation Mode

After implementation, to verify the result:

"Run the E2E tests to verify that the checkout flow works correctly."

Which AI Models Have We Used?

One of Cascade's most powerful features is the ability to choose between different AI models depending on the nature of the task. During the pilot development, I primarily worked with the following models:

Gemini Pro 2.5

Primary use: one of my most used models for general and complex tasks.
Strengths:
- Excellent balance between speed and quality.
- Very good at understanding multilingual context (Spanish/English).
- Deep reasoning capabilities.
- Clean and well-structured code generation.
Use cases: architectural refactorings, implementation of complex features, requirements analysis.

Claude Sonnet 4

Primary use: tasks requiring maximum precision and detailed analysis.
Strengths:
- Excellent for analyzing existing code and architecture.
- Highly precise in understanding technical requirements.
- Extremely careful handling of critical changes.
- Superior in technical documentation and detailed explanations.
Use cases: code security reviews (JWT, authentication), deep architectural analysis, complex system refactorings (User/AdminUser).

Claude Sonnet 3.7 (Thinking)

Primary use: complex problems that require step-by-step reasoning.
Strengths:
- Visible and structured "thinking" process.
- Excellent for debugging tough issues.
- Methodical root cause analysis.
- Very good at exploring different approaches.
Use cases: debugging race conditions (refresh tokens), resolving Kafka errors, deserialization issues.

GPT-5

Primary use: cutting-edge tasks requiring advanced capabilities.
Strengths:
- Most advanced reasoning capabilities.
- Excellent understanding of complex context.
- Very good for problems that require technical creativity.
- High-quality code generation.
Use cases: advanced optimizations, solving unusual problems. So far, it's the model I've used the least and the one that has convinced me the least.

Claude Sonnet 4.5 and 4.5 (Thinking) (New, October 2025)

At the time of writing, I’ve tested it on a Mapper-to-MapStruct migration, where it produced a very comprehensive plan with estimates, potential problems and their solutions, code examples, dependency injection strategies, testing percentages, and refactorable lines. Its ability to refactor, troubleshoot, and analyze the environment really impressed me: it is noticeably more capable than the previous models. This model brings:

Improved reasoning: better capability for multi-step complex problems and strategic planning.
Advanced programming: greater precision in debugging, refactoring, and complex system architecture.
Better instruction following: improved adherence to constraints and specific requirements.
Longer context: better handling of large codebases and long conversations.
Lower error rate: significant reduction in hallucinations and logical mistakes.

Model Observations

Gemini Pro 2.5 as the main model:

It has become my "go-to" model for most daily tasks.
Excellent speed/quality ratio.
Very versatile for different types of problems.

Claude Sonnet family (3.7 and 4):

Sonnet 4 is my pick for critical code and important architectural decisions.
Sonnet 3.7 (Thinking) is invaluable when I need to "see" the reasoning process.
The "Thinking" mode has been especially useful to understand why certain solutions work.

GPT-5 for special cases:

Reserved for tasks requiring next-gen capabilities, since it was the latest model available during the pilot. I didn’t find it particularly useful: I tried three or four prompts, and it didn’t return the answers I expected.

Consistency across models:

All models share access to the same persistent memory system and memory bank.
Switching models does not affect the project’s accumulated knowledge.
Sometimes I test the same problem with multiple models to compare approaches.

Using Plans to Organize the Project

One of the most effective patterns I’ve developed while working with Cascade is the systematic use of written plans to organize and document complex work. These plans function as:

Roadmaps for large implementations.
Living documentation for the project.
Reference points to resume work after interruptions.
Historical record of architectural and technical decisions.

They are not static plans. As we write and interact, we may realize that certain decisions don’t yield what we expected. In those cases, we ask Cascade to rewrite the affected plan with the best strategy, even setting specific constraints to avoid hallucinations.

Structure of the plans/ Folder

I’ve organized the plans in my project by type:

General Plan

plans/plan-general.md

In this plan we cover:

Complete project vision.
Main modules (Products, Checkout, Payments).
Technologies used (Java 17, React, PostgreSQL, MongoDB).
Implementation phases.
Current status and next steps.

Excerpt from the general plan:

### Phase 1: Project Setup
1. Create the project folder structure
2. Set up the development environment
3. Configure dependencies and build tools
4. Set up database connections

Module Implementation Plans

Detailed plans for each system component:

plan-modulo-productos.md – CRUD and catalog management.
plan-modulo-checkout.md – Shopping cart and purchase flow.
plan-modulo-pagos.md – Integration with Bizum, Redsys, and wire transfers.
plan-modulo-mail-pedido.md – Email notification system.
plan-integracion-cart-checkout.md – Frontend-backend integration.

Typical pattern of these plans:

## Objective
[Clear description of the module]

## Architecture (3 Layers)
   Presentation Layer (Controller)
   Service Layer (Service)
   Persistence Layer (Repository)

## Detailed Implementation
   [Technical step-by-step]

## Testing
   [Testing strategy]
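
As a rough illustration of the three layers named in this template, here is a minimal framework-free sketch in plain Java. Every name in it (`Product`, `ProductRepository`, `ProductService`, `ProductController`, `postProduct`) is hypothetical and not taken from the project; a real module would presumably use Spring annotations and a database-backed repository instead of an in-memory map.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical domain object; the project's real Product entity is not shown in the post.
record Product(String id, String name, double price) {}

// Persistence layer: stores and retrieves products (in memory here, a database in practice).
class ProductRepository {
    private final Map<String, Product> store = new LinkedHashMap<>();

    Product save(Product product) {
        store.put(product.id(), product);
        return product;
    }

    Optional<Product> findById(String id) {
        return Optional.ofNullable(store.get(id));
    }

    List<Product> findAll() {
        return List.copyOf(store.values());
    }
}

// Service layer: business rules live here, not in the controller or the repository.
class ProductService {
    private final ProductRepository repository;

    ProductService(ProductRepository repository) {
        this.repository = repository;
    }

    Product create(String id, String name, double price) {
        if (name == null || name.isBlank()) {
            throw new IllegalArgumentException("name must not be blank");
        }
        if (price < 0) {
            throw new IllegalArgumentException("price must not be negative");
        }
        return repository.save(new Product(id, name, price));
    }
}

// Presentation layer: turns requests into service calls and results into responses.
class ProductController {
    private final ProductService service;

    ProductController(ProductService service) {
        this.service = service;
    }

    // Returns a status-like string instead of an HTTP response to stay framework-free.
    String postProduct(String id, String name, double price) {
        try {
            Product created = service.create(id, name, price);
            return "201 Created: " + created.id();
        } catch (IllegalArgumentException e) {
            return "400 Bad Request: " + e.getMessage();
        }
    }
}

class ThreeLayerDemo {
    public static void main(String[] args) {
        ProductController controller =
                new ProductController(new ProductService(new ProductRepository()));
        System.out.println(controller.postProduct("p-1", "Running shoes", 59.90));
        System.out.println(controller.postProduct("p-2", "", 10.0));
    }
}
```

The point of the split is that validation sits in the service, so it applies no matter which entry point calls it, and the repository can be swapped without touching the other two layers.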

Refactoring Plans

These plans document major architectural changes:

plan_user_unification.md – Unification of users/admin_users tables.

Example of a refactoring plan:

# Refactoring Plan: User Unification

Phase 1: Analysis and Preparation
  [x] Problem Identification
  [x] Architectural Decision

Phase 2: Backend Refactor
  [ ] Modify User.java entity
  [ ] Remove AdminUser.java
  [ ] Unify UserDetailsService
  [ ] Update SecurityConfig

Phase 3: Verification and Testing
  [ ] Update unit tests
  [ ] Run E2E test suite

Phase 4: Final Cleanup
  [ ] Remove obsolete files
  [ ] Update project memory
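
Because plans like this one track status with `[x]` and `[ ]` checkboxes, progress per phase can be computed mechanically. The following is only a sketch of that idea, not something the project contains: the `PlanProgress` class and its `summarize` method are hypothetical, and the parser assumes each phase starts with a line beginning with `Phase ` (optionally preceded by markdown `#` characters).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: summarizes checkbox progress per phase of a plan document.
class PlanProgress {

    // Returns phase title -> {done, total}, in document order.
    static Map<String, int[]> summarize(String planText) {
        Map<String, int[]> phases = new LinkedHashMap<>();
        int[] current = null;
        for (String rawLine : planText.split("\\R")) {
            // Trim indentation and drop leading markdown heading markers, if any.
            String line = rawLine.strip().replaceFirst("^#+\\s*", "");
            if (line.startsWith("Phase ")) {
                current = new int[2];
                phases.put(line, current);
            } else if (current != null && line.startsWith("[x]")) {
                current[0]++; // completed task
                current[1]++;
            } else if (current != null && line.startsWith("[ ]")) {
                current[1]++; // pending task
            }
        }
        return phases;
    }

    public static void main(String[] args) {
        String plan = """
                # Refactoring Plan: User Unification

                Phase 1: Analysis and Preparation
                  [x] Problem Identification
                  [x] Architectural Decision

                Phase 2: Backend Refactor
                  [ ] Modify User.java entity
                  [ ] Remove AdminUser.java
                """;
        summarize(plan).forEach((phase, counts) ->
                System.out.println(phase + " -> " + counts[0] + "/" + counts[1]));
    }
}
```

A summary like this could be pasted into a prompt when resuming work, so that Cascade and the developer start from the same view of what is already done.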

Testing Plans

Comprehensive strategies for different types of tests:

e2e_testing_setup.md – Playwright setup with Maven.
plan_gatling_tests.md – Load and performance testing.

E2E Plan (excerpt):

### Phase 1: Dependency Isolation
1. Maven Profile (e2e-test) in Backend
   - Exclude spring-kafka and Testcontainers
   - Skip unit tests

### Phase 2: Controlled Backend Launcher
   - Create InfiniaSportsE2ETestApplication.java
   - Use SpringApplicationBuilder
   - Force e2e-test profile

### Phase 3: Lifecycle Orchestration with Maven
   - exec-maven-plugin in playwright-tests
   - pre-integration-test: Start backend and frontend
   - integration-test: Run Playwright tests
   - post-integration-test: Stop servers

Infrastructure Plans

Configuration of external services and deployment:

plan-kafka-carga-productos.md – Kafka integration for asynchronous data loading.
plan-despliegue-gcp-cliente.md – Deployment to Google Cloud Platform.
plan-despliegue-gcp-gratuito.md – Free tier deployment alternative.

Frontend Plans

plan-frontend-react.md – React component structure.
plan-proxy-admin-frontend.md – Proxy configuration for the admin frontend.

Documentation Plans

plan-openapi.md – OpenAPI 3.0 specification for the API.

Plan-Based Working Methodology

Let’s break down the step-by-step methodology for working with plans.

Plan creation. When facing a complex task, I first ask Cascade to write the plan:

"We're going to implement E2E tests with Playwright. 
Before we begin, create a detailed plan in plans/e2e_testing_setup.md 
including phases, dependencies, configuration, and orchestration strategy."

Cascade researches, proposes a structure, and creates the document.

Plan-guided iteration. The plan becomes the working guide:

"According to the plan in plans/e2e_testing_setup.md, 
let’s implement Phase 1: Dependency Isolation."

Cascade consults the plan and executes that specific phase.

Plan updates during development. As I progress, I update the status:

"Update plan_user_unification.md:  
mark Phase 2 as completed and add notes  
about the race condition issue we encountered."

Plans as project memory. The plans document key technical decisions.

Real example (plan-kafka-carga-productos.md):

## 4.1. Robust persistence and idempotency

The DataInitializer assigns a unique UUID to each product before sending it to Kafka.

The consumer (ProductConsumer) converts the id from String to UUID and checks if it already exists (existsById) before saving.

Idempotency control: products are not duplicated even if the flow is repeated.
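
The idempotency check described in this excerpt can be sketched without Kafka or Spring. `ProductConsumer`, `DataInitializer`, and `existsById` come from the excerpt; everything else here (the `ProductMessage` record, the in-memory repository, and the `consume` method name) is an assumption made for illustration and is not the project's actual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical message payload: the id travels as a String, as in the excerpt.
record ProductMessage(String id, String name) {}

// Stand-in for the real repository; only the calls used below are modeled.
class InMemoryProductRepository {
    private final Map<UUID, String> rows = new HashMap<>();

    boolean existsById(UUID id) {
        return rows.containsKey(id);
    }

    void save(UUID id, String name) {
        rows.put(id, name);
    }

    int count() {
        return rows.size();
    }
}

// Sketch of the consumer side: convert the id, skip products that are already stored.
class ProductConsumer {
    private final InMemoryProductRepository repository;

    ProductConsumer(InMemoryProductRepository repository) {
        this.repository = repository;
    }

    // Returns true if the product was saved, false if it was a duplicate delivery.
    boolean consume(ProductMessage message) {
        // UUID.fromString throws IllegalArgumentException on a malformed id.
        UUID id = UUID.fromString(message.id());
        if (repository.existsById(id)) {
            return false;
        }
        repository.save(id, message.name());
        return true;
    }
}

class IdempotencyDemo {
    public static void main(String[] args) {
        InMemoryProductRepository repository = new InMemoryProductRepository();
        ProductConsumer consumer = new ProductConsumer(repository);

        // Producer side (DataInitializer in the excerpt): assign the UUID before sending.
        ProductMessage message = new ProductMessage(UUID.randomUUID().toString(), "Football");

        System.out.println(consumer.consume(message)); // first delivery
        System.out.println(consumer.consume(message)); // same message replayed
        System.out.println(repository.count());
    }
}
```

Note that a check followed by a save is not atomic. With several consumers running concurrently, a unique constraint on the id column would still be needed to rule out duplicates.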

What are the advantages of this approach?

We can identify 6 main advantages:

Clarity. I always know what comes next.
Traceability. History of technical decisions.
Collaboration. Documentation that others can follow.
Recovery. Easy to resume after interruptions.
Validation. Clear progress checkpoints.
Learning. Record of solutions to complex problems.

Integration with Cascade’s memory system

Plans and memories complement each other. Plans contain structured documentation and implementation phases, while memories store rules and decisions that Cascade must always remember.

Example:

The plan-kafka-carga-productos.md documents the implementation.
A memory stores: "The Product id field is a String generated with uuid2, not a native UUID".

Cascade uses the memory in future implementations without needing to consult the plan every time. Additionally, we ask Cascade to store the same rule in the memory bank, so it is available both as a runtime memory and as a persistent one.

Conclusion

Cascade by Windsurf does not replace us as developers, but provides a complete, synchronized work environment with the ability to support us at key moments in design and development.

It also boosts our productivity; I estimate the gain at between 80% and 100%. So, even though it’s not a perfect tool and we still need to pilot the ship, I believe it is essential today. Have you tried it? I’d love to hear your thoughts.

Raúl Martínez

Computer Engineer with 20 years of experience in application development and leading multidisciplinary teams. After 8 years at Indra, where he worked on projects for the Ministry of Education, DGT (General Directorate of Traffic), RFEF (Royal Spanish Football Federation), and SELAE (State Lotteries and Gambling), he saw Paradigma as an opportunity to continue growing and developing as a professional. Now, after 10 years at the company and having deepened his knowledge of various technologies and work methods, he looks forward to continuing to improve and learn in this wonderful profession.
