Recently, I’ve been using Windsurf’s Cascade, integrated into IntelliJ, to develop an AI-driven pilot project while testing different techniques. Cascade is an AI assistant that can execute actions autonomously and implement complete solutions. It offers a deep understanding of project context, specialized tools, true multitasking, and a persistent memory that stores important steps taken during task execution.
In this post, I’ll describe my experience with this tool over the last three months. In the next one, I’ll talk about using the interface, its rules, and the working modes (chat and code).
We’ll cover the available models, their capabilities, the ways I interact with Cascade, and planning strategies: a complete guide to how we’ve used this tool, always with the goal of improving and embracing new ways of working.
What Is Windsurf Cascade?
Cascade is an agentic AI assistant developed by Windsurf, specifically designed for pair programming and software development. Unlike simple code autocompletion, Cascade is an autonomous agent capable of:
- Understanding full context: it analyzes your project, open files, code structure, and architectural decisions.
- Executing actions: it doesn’t just suggest code; it can edit files, run commands, navigate the project, and validate changes.
- Reasoning and planning: it breaks down complex tasks into steps, researches solutions, and proposes strategies before implementing them.
- Learning from the project: it uses a persistent memory system to remember rules, patterns, and decisions specific to your codebase.
In essence, it’s a development partner that understands both the code and your project context.
Key Features
1 Deep IDE Integration
Cascade integrates natively with IDEs like IntelliJ IDEA and VS Code, providing:
- Access to open files and cursor position.
- Ability to edit multiple files simultaneously.
- Execution of system commands (Maven, npm, git, etc.).
- Intelligent project navigation.
2 Multiple AI Models
You can switch between different models depending on the task:
- Gemini Pro 2.5 for balanced speed/quality.
- Claude Sonnet 4 for maximum precision.
- Claude Sonnet 3.7 (Thinking) for visible reasoning.
- GPT‑5 for advanced capabilities.
- (New) Claude Sonnet 4.5 (Thinking) for deeper internal reasoning, better planning for complex tasks, and higher accuracy. Extremely powerful.
- Etc.
3 Persistent Memory System with a Memory Bank
- Stores architectural decisions.
- Remembers project-specific rules.
- Keeps context across conversations.
- Avoids repeating explanations.
4 Productivity Tools
- Semantic search across the codebase.
- Plans and TO‑DO lists for complex tasks.
- Safe command execution with user confirmation.
- Automatic test validation.
- Advanced search (grep_search) for identifying code patterns.
- Integration with MCP servers such as Cloud Run (deployment) and Context7 (up-to-date library documentation).
5 Agentic Capabilities
- Autonomous research. It looks up best practices before implementing solutions.
- Dependency analysis. Understands change impact (not always perfectly).
- Intelligent refactoring. Updates tests and docs automatically (not always flawlessly).
- Systematic debugging. Reproduces errors, analyzes root causes, and proposes solutions (this is the most impressive part: it resolves issues very quickly by reading logs or adding its own debug logging).
Working Modes with Cascade
After three months of working with this tool, I’ve identified several modes of interaction with Cascade, depending on the nature of the task:
Research Mode
This is the mode I use when I don’t know the best solution:
"Research the standard pattern for handling refresh tokens in Spring Boot. Explain the options to me and recommend one before implementing."
Implementation Mode
In this case, I use it for clear and well‑defined tasks:
"Create a POST /api/orders endpoint that receives an OrderDTO, validates the data, and saves the order in the database using OrderService."
Debugging Mode
This is the mode I use to resolve errors:
"I’m getting a 401 error on /api/orders. Stack trace attached. Check JwtAuthenticationFilter and SecurityConfig."
Architecture Mode
For design decisions:
"I want to unify the User and AdminUser entities. Analyze the impact and propose a step-by-step migration strategy."
Learning Mode
This mode can be used to understand existing code:
"Explain to me how the refresh token system we implemented works. How does it prevent race conditions?"
Validation Mode
For post-implementation checks:
"Run the E2E tests to verify that the checkout flow works correctly."
Which AI Models Have We Used?
One of Cascade's most powerful features is the ability to choose between different AI models depending on the nature of the task. During the pilot development, I primarily worked with the following models:
Gemini Pro 2.5
- Primary use: one of my most used models for general and complex tasks.
- Strengths:
  - Excellent balance between speed and quality.
  - Very good at understanding multilingual context (Spanish/English).
  - Deep reasoning capabilities.
  - Clean and well-structured code generation.
- Use cases: architectural refactorings, implementation of complex features, requirements analysis.
Claude Sonnet 4
- Primary use: tasks requiring maximum precision and detailed analysis.
- Strengths:
  - Excellent for analyzing existing code and architecture.
  - Highly precise in understanding technical requirements.
  - Extremely careful handling of critical changes.
  - Superior in technical documentation and detailed explanations.
- Use cases: code security reviews (JWT, authentication), deep architectural analysis, complex system refactorings (User/AdminUser).
Claude Sonnet 3.7 (Thinking)
- Primary use: complex problems that require step-by-step reasoning.
- Strengths:
  - Visible and structured "thinking" process.
  - Excellent for debugging tough issues.
  - Methodical root cause analysis.
  - Very good at exploring different approaches.
- Use cases: debugging race conditions (refresh tokens), resolving Kafka errors, deserialization issues.
GPT-5
- Primary use: cutting-edge tasks requiring advanced capabilities.
- Strengths:
  - Most advanced reasoning capabilities.
  - Excellent understanding of complex context.
  - Very good for problems that require technical creativity.
  - High-quality code generation.
- Use cases: advanced optimizations, solving unusual problems. So far, it's the model I've used the least and the one that has convinced me the least.
Claude Sonnet 4.5 and 4.5 (Thinking) (New, October 2025)
At the time of writing, I’ve tested it on a Mapper-to-MapStruct migration, where it produced a very comprehensive plan with estimates, potential problems and solutions, code examples, dependency injection strategies, testing percentages, and refactorable lines. Its ability to refactor, troubleshoot, and analyze the environment really impressed me: it is noticeably more capable than the previous models. This model brings:
- Improved reasoning: better capability for multi-step complex problems and strategic planning.
- Advanced programming: greater precision in debugging, refactoring, and complex system architecture.
- Better instruction following: improved adherence to constraints and specific requirements.
- Longer context: better handling of large codebases and long conversations.
- Lower error rate: significant reduction in hallucinations and logical mistakes.
Model Observations
Gemini Pro 2.5 as the main model:
- It has become my "go-to" model for most daily tasks.
- Excellent speed/quality ratio.
- Very versatile for different types of problems.
Claude Sonnet family (3.7 and 4):
- Sonnet 4 is my pick for critical code and important architectural decisions.
- Sonnet 3.7 (Thinking) is invaluable when I need to "see" the reasoning process.
- The "Thinking" mode has been especially useful to understand why certain solutions work.
GPT-5 for special cases:
- Reserved for tasks requiring next-gen capabilities (it was the latest model available during the pilot).
- I didn’t find it particularly useful: I tried three or four prompts, and it didn’t return the answers I expected.
Consistency across models:
- All models share access to the same persistent memory system and memory bank.
- Switching models does not affect the project’s accumulated knowledge.
- Sometimes I test the same problem with multiple models to compare approaches.
Using Plans to Organize the Project
One of the most effective patterns I’ve developed while working with Cascade is the systematic use of written plans to organize and document complex work. These plans function as:
- Roadmaps for large implementations.
- Living documentation for the project.
- Reference points to resume work after interruptions.
- Historical record of architectural and technical decisions.
The plans are not static. As we write and interact, we may find that certain decisions don’t yield what we expected. In those cases, we ask Cascade to rewrite the affected plan with a better strategy, sometimes setting specific constraints to avoid hallucinations.
Structure of the plans/ Folder
I’ve organized the plans in my project by type:
- General Plan
plans/plan-general.md
In this plan we cover:
- Complete project vision.
- Main modules (Products, Checkout, Payments).
- Technologies used (Java 17, React, PostgreSQL, MongoDB).
- Implementation phases.
- Current status and next steps.
Excerpt from the general plan:
### Phase 1: Project Setup
1. Create the project folder structure
2. Set up the development environment
3. Configure dependencies and build tools
4. Set up database connections
- Module Implementation Plans
Detailed plans for each system component:
- plan-modulo-productos.md – CRUD and catalog management.
- plan-modulo-checkout.md – Shopping cart and purchase flow.
- plan-modulo-pagos.md – Integration with Bizum, Redsys, and wire transfers.
- plan-modulo-mail-pedido.md – Email notification system.
- plan-integracion-cart-checkout.md – Frontend-backend integration.
- Typical pattern of these plans:
## Objective
[Clear description of the module]
## Architecture (3 Layers)
### Presentation Layer (Controller)
### Service Layer (Service)
### Persistence Layer (Repository)
## Detailed Implementation
[Technical step-by-step]
## Testing
[Testing strategy]
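The three layers in this template can be sketched in plain Java, without Spring, to show where each responsibility sits. This is only an illustration under assumptions: the fields of OrderDTO, the validation rules, the in-memory repository, and the status codes are hypothetical. Only the names OrderDTO and OrderService come from the implementation prompt earlier in this post.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical DTO: the post names OrderDTO but does not show its fields.
record OrderDTO(String productId, int quantity) {}

// Persistence layer: an in-memory stand-in for a real repository (assumption).
class OrderRepository {
    private final List<OrderDTO> orders = new ArrayList<>();

    void save(OrderDTO order) { orders.add(order); }

    int count() { return orders.size(); }
}

// Service layer: owns validation and business rules.
class OrderService {
    private final OrderRepository repository;

    OrderService(OrderRepository repository) { this.repository = repository; }

    void placeOrder(OrderDTO order) {
        if (order.productId() == null || order.productId().isBlank()) {
            throw new IllegalArgumentException("productId is required");
        }
        if (order.quantity() <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        repository.save(order);
    }
}

// Presentation layer: translates service outcomes into HTTP-style status codes.
class OrderController {
    private final OrderService service;

    OrderController(OrderService service) { this.service = service; }

    int postOrder(OrderDTO order) {
        try {
            service.placeOrder(order);
            return 201; // Created
        } catch (IllegalArgumentException e) {
            return 400; // Bad Request
        }
    }
}

public class LayeredOrderDemo {
    public static void main(String[] args) {
        OrderRepository repository = new OrderRepository();
        OrderController controller = new OrderController(new OrderService(repository));

        System.out.println(controller.postOrder(new OrderDTO("sku-1", 2))); // prints 201
        System.out.println(controller.postOrder(new OrderDTO("sku-1", 0))); // prints 400
        System.out.println(repository.count());                             // prints 1
    }
}
```

Keeping validation in the service rather than the controller means the same rules apply no matter which entry point calls the service.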
- Refactoring Plans
These plans document major architectural changes:
- plan_user_unification.md – Unification of users/admin_users tables.
Example of a refactoring plan:
# Refactoring Plan: User Unification
## Phase 1: Analysis and Preparation
- [x] Problem Identification
- [x] Architectural Decision
## Phase 2: Backend Refactor
- [ ] Modify User.java entity
- [ ] Remove AdminUser.java
- [ ] Unify UserDetailsService
- [ ] Update SecurityConfig
## Phase 3: Verification and Testing
- [ ] Update unit tests
- [ ] Run E2E test suite
## Phase 4: Final Cleanup
- [ ] Remove obsolete files
- [ ] Update project memory
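Because a plan like this uses checkboxes, its progress can also be read mechanically. The following is a minimal sketch, not something Cascade provides: the PlanProgress class and its methods are hypothetical names, and it assumes checklist items start with "[x]" or "[ ]", optionally preceded by a "-" or "*" bullet.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlanProgress {
    // Matches "[x]", "[X]" or "[ ]" at the start of a line,
    // optionally preceded by whitespace and a "-" or "*" bullet.
    private static final Pattern CHECKBOX =
            Pattern.compile("^\\s*(?:[-*]\\s+)?\\[([ xX])\\]");

    // Hypothetical result type: completed items versus all checklist items.
    record Progress(int done, int total) {
        int percent() { return total == 0 ? 0 : done * 100 / total; }
    }

    static Progress of(List<String> planLines) {
        int done = 0;
        int total = 0;
        for (String line : planLines) {
            Matcher m = CHECKBOX.matcher(line);
            if (m.find()) {
                total++;
                if (!m.group(1).equals(" ")) {
                    done++; // any mark other than a blank counts as completed
                }
            }
        }
        return new Progress(done, total);
    }

    public static void main(String[] args) {
        List<String> plan = List.of(
                "# Refactoring Plan: User Unification",
                "Phase 1: Analysis and Preparation",
                "[x] Problem Identification",
                "[x] Architectural Decision",
                "Phase 2: Backend Refactor",
                "[ ] Modify User.java entity",
                "[ ] Remove AdminUser.java");
        Progress p = of(plan);
        // prints "2/4 items done (50%)"
        System.out.println(p.done() + "/" + p.total() + " items done (" + p.percent() + "%)");
    }
}
```

For a real plan file, the lines could come from `Files.readAllLines(Path.of(...))` instead of the inline list.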
- Testing Plans
Comprehensive strategies for different types of tests:
- e2e_testing_setup.md – Playwright setup with Maven.
- plan_gatling_tests.md – Load and performance testing.
E2E Plan (excerpt):
### Phase 1: Dependency Isolation
1. Maven Profile (e2e-test) in Backend
- Exclude spring-kafka and Testcontainers
- Skip unit tests
### Phase 2: Controlled Backend Launcher
- Create InfiniaSportsE2ETestApplication.java
- Use SpringApplicationBuilder
- Force e2e-test profile
### Phase 3: Lifecycle Orchestration with Maven
- exec-maven-plugin in playwright-tests
- pre-integration-test: Start backend and frontend
- integration-test: Run Playwright tests
- post-integration-test: Stop servers
- Infrastructure Plans
Configuration of external services and deployment:
- plan-kafka-carga-productos.md – Kafka integration for asynchronous data loading.
- plan-despliegue-gcp-cliente.md – Deployment to Google Cloud Platform.
- plan-despliegue-gcp-gratuito.md – Free tier deployment alternative.
- Frontend Plans
- plan-frontend-react.md – React component structure.
- plan-proxy-admin-frontend.md – Proxy configuration for the admin frontend.
- Documentation Plans
- plan-openapi.md – OpenAPI 3.0 specification for the API.
Plan-Based Working Methodology
Let’s break down the step-by-step methodology for working with plans.
1. Create the plan with Cascade when facing a complex task:
"We're going to implement E2E tests with Playwright.
Before we begin, create a detailed plan in plans/e2e_testing_setup.md
including phases, dependencies, configuration, and orchestration strategy."
Cascade researches, proposes a structure, and creates the document.
2. Iterate guided by the plan. The plan becomes the working guide:
"According to the plan in plans/e2e_testing_setup.md,
let’s implement Phase 1: Dependency Isolation."
Cascade consults the plan and executes that specific phase.
3. Update the plan during development. As I progress, I update the status:
"Update plan_user_unification.md:
mark Phase 2 as completed and add notes
about the race condition issue we encountered."
4. Use plans as project memory. The plans document key technical decisions.
Real example (plan-kafka-carga-productos.md):
## 4.1. Robust persistence and idempotency
The DataInitializer assigns a unique UUID to each product before sending it to Kafka.
The consumer (ProductConsumer) converts the id from String to UUID and checks if it already exists (existsById) before saving.
Idempotency control: products are not duplicated even if the flow is repeated.
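The consumer-side check described in this excerpt can be sketched in plain Java. This is not the project's code: the in-memory repository stands in for the real persistence layer, and the Product fields, the `consume` signature, and its boolean return value are assumptions made for illustration. Only the ProductConsumer name and the existsById check come from the plan.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for the Product entity; only id and name matter here.
record Product(UUID id, String name) {}

// Hypothetical in-memory replacement for the real repository (existsById / save).
class InMemoryProductRepository {
    private final Map<UUID, Product> store = new HashMap<>();

    boolean existsById(UUID id) { return store.containsKey(id); }

    void save(Product product) { store.put(product.id(), product); }

    int count() { return store.size(); }
}

// Sketch of the idempotent consumer: convert the id, check, then save.
class ProductConsumer {
    private final InMemoryProductRepository repository;

    ProductConsumer(InMemoryProductRepository repository) { this.repository = repository; }

    // Returns true if the product was stored, false if it was skipped as a duplicate.
    boolean consume(String rawId, String name) {
        UUID id = UUID.fromString(rawId); // throws IllegalArgumentException on a malformed id
        if (repository.existsById(id)) {
            return false; // already processed: a repeated message changes nothing
        }
        repository.save(new Product(id, name));
        return true;
    }
}

public class IdempotentConsumerDemo {
    public static void main(String[] args) {
        InMemoryProductRepository repository = new InMemoryProductRepository();
        ProductConsumer consumer = new ProductConsumer(repository);
        String id = UUID.randomUUID().toString(); // the producer assigns the id once

        System.out.println(consumer.consume(id, "Running shoes")); // prints true
        System.out.println(consumer.consume(id, "Running shoes")); // prints false
        System.out.println(repository.count());                    // prints 1
    }
}
```

With several consumers running in parallel, the check and the save are not atomic, so a unique constraint on the id column would still be a sensible safety net.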
What are the advantages of this approach?
I see six main advantages:
- Clarity. I always know what comes next.
- Traceability. History of technical decisions.
- Collaboration. Documentation that others can follow.
- Recovery. Easy to resume after interruptions.
- Validation. Clear progress checkpoints.
- Learning. Record of solutions to complex problems.
Integration with Cascade’s memory system
Plans and memories complement each other. Plans contain structured documentation and implementation phases, while memories store rules and decisions that Cascade must always remember.
Example:
- plan-kafka-carga-productos.md documents the implementation.
- A memory stores: "The Product id field is a String generated with uuid2, not a native UUID".
- Cascade uses the memory in future implementations without needing to consult the plan every time. Additionally, we ask Cascade to store the same rule in the memory bank, so that it exists both as a runtime memory and as a persistent one.
Conclusion
Cascade by Windsurf does not replace us as developers, but provides a complete, synchronized work environment with the ability to support us at key moments in design and development.
It also boosts our productivity, by what I estimate to be 80% to 100%. So, even though it’s not a perfect tool and we still need to pilot the ship, I believe it is essential today to increase our productivity. Have you tried it? I’d love to hear your thoughts.