Chat REPL: Conversational Coding with Context Memory

We’ve covered context optimization and smart file selection. But what about maintaining context across multiple questions?

The Problem: One-Shot Interactions

Traditional AI coding assistants are conversational… sort of.

The issue:

$ tool "explain this function"
[explanation]

$ tool "now refactor it"
[error: what function?]

Each invocation starts fresh. No memory, no context, no follow-up.

You end up:

Repeating context in every query
Copy-pasting code between prompts
Losing thread of the conversation
Forgetting what you asked 3 questions ago

The Solution: Chat REPL

GPTCode’s gptcode chat is a persistent REPL (Read-Eval-Print Loop) with:

Conversation history
File context awareness
Token management
Save/load conversations

Basic Usage

$ gptcode chat

> explain the authentication flow
[GPTCode reads auth.go, middleware.go, session.go]
The authentication flow works as follows:
1. User submits credentials to /login
2. Middleware validates JWT token
3. Session stored in Redis...

> how is the JWT token generated?
[GPTCode remembers context, focuses on token logic]
The JWT token is generated in auth/token.go:
- Uses HMAC-SHA256 signing
- Includes user_id, email, exp claims
- Token expires in 24 hours...

> add refresh token support
[GPTCode knows you're extending the JWT system]
I'll add refresh token functionality:
1. Create refresh_token table
2. Generate long-lived refresh token (30 days)
3. Add /refresh endpoint
4. Update login to return both tokens...

Key difference: GPTCode remembers the conversation. Question 2 builds on Question 1. Question 3 knows we’re working on auth + JWT.

Features

1. Conversation History

Tracks all messages with metadata:

{
  "role": "user",
  "content": "explain the authentication flow",
  "timestamp": "2025-12-04T10:30:00Z",
  "token_count": 8
}

Benefits:

Context preserved across questions
No need to repeat yourself
Natural follow-up questions

2. File Context Awareness

Automatically includes relevant files based on:

Current directory
Recent git changes
Mentioned file names
Dependency graph

Example:

> fix bug in auth/handler.go

GPTCode sees:

auth/handler.go (explicitly mentioned)
auth/middleware.go (dependency)
models/user.go (used by handler)
Recent changes in git diff

3. Token Management

Tracks context window usage:

> /context
Context Manager Status:
  Total Messages: 8
  Total Tokens: 2,450 / 8,000 (30.6%)
  Oldest Message: 2025-12-04 10:25:03
  Recent File Updates: 3 minutes ago

Automatic management:

Keeps last 50 messages (configurable)
Drops oldest when exceeding limit
Summarizes long messages
Preserves important context

4. REPL Commands

Command	Description
`/exit`, `/quit`	Exit chat
`/clear`	Clear conversation history
`/save <file>`	Save conversation to file
`/load <file>`	Load previous conversation
`/context`	Show context statistics
`/files`	List files in context
`/history`	Show conversation history
`/help`	Show all commands

5. Save/Load Conversations

Continue later where you left off:

> /save auth-discussion.json
Conversation saved to auth-discussion.json

# Later, different terminal session
$ gptcode chat
> /load auth-discussion.json
Conversation loaded from auth-discussion.json

> continue with the refresh token implementation
[GPTCode has full context from previous session]

Real-World Workflow

Scenario: Adding Feature to Existing Code

$ gptcode chat

> show me the current user registration flow
[GPTCode reads user.go, handler.go, validates.go]
Current registration:
1. POST /register with email, password
2. Validate email format
3. Hash password with bcrypt
4. Save to database
5. Send welcome email

> what validation rules exist?
[Remembers we're discussing registration]
Current validation:
- Email: Must be valid format
- Password: Min 8 chars
- No duplicate email check yet ⚠️

> add duplicate email validation
[Knows exact context: registration + validation]
I'll add duplicate check:
1. Before insert, query users by email
2. Return 409 Conflict if exists
3. Update validation error messages

[code changes here]

> also add password strength requirements
[Still in context of registration + validation]
Adding password strength:
- Min 8 chars (existing)
- Must include: uppercase, lowercase, number, special char
- Updated error messages

[code changes here]

> write tests for these validations
[Full context: registration + validation + recent changes]
I'll create tests:
- TestDuplicateEmailRejected
- TestWeakPasswordRejected
- TestValidPasswordAccepted

[test code here]

> /save user-registration-improvements.json
Conversation saved.

Result: 5 related changes, all with proper context. No repetition, natural flow.

Initial Message Support

Start with a question immediately:

$ gptcode chat "explain the database schema"
[Processes question and stays open]
The database has 5 main tables:
...

> how is user data encrypted?
[Follow-up in same session]

Non-interactive mode (for CI/scripts):

$ gptcode chat "generate migration for new column" | tee migration.sql
[Processes and exits]

REPL detects TTY:

TTY present → stays open for follow-up
Piped/redirected → processes and exits

Advanced: Context Control

View Current Context

> /context
Context Manager Status:
  Total Messages: 12
  Total Tokens: 4,823 / 8,000 (60.3%)
  Cache Hit Rate: 71%
  Oldest Message: 2025-12-04 10:15:22
  Recent File Updates: 2 minutes ago

Check Files in Context

> /files
Files in context:
  internal/auth/handler.go
  internal/auth/middleware.go
  internal/models/user.go
  config/security.go

Clear and Start Fresh

> /clear
Conversation history cleared.

> [start new topic]

Useful when:

Switching to unrelated task
Context getting too large
Want clean slate

Technical Details

Context Manager

Implements sliding window with smart truncation:

type ContextManager struct {
    messages     []Message
    maxTokens    int  // 8000 default
    maxMessages  int  // 50 default
    fileContext  string
}

func (cm *ContextManager) AddMessage(role, content string, tokens int) {
    cm.messages = append(cm.messages, Message{
        Role: role,
        Content: content,
        Timestamp: time.Now(),
        TokenCount: tokens,
    })
    
    // Auto-truncate if needed
    if cm.getTotalTokens() > cm.maxTokens {
        cm.truncateOldest()
    }
}

Token Estimation

Uses simple heuristic (will be improved):

func estimateTokens(text string) int {
    // Rough estimate: 1 token ≈ 4 characters
    return len(text) / 4
}

Future: Use tiktoken for accurate counts.

File Context Updates

Refreshes every 5 minutes or on demand:

func (cm *ContextManager) UpdateFileContext() error {
    // Get git status
    changed, _ := getGitChangedFiles()
    
    // Get cwd files
    local, _ := listLocalFiles(".")
    
    // Build context string
    cm.fileContext = buildFileContext(changed, local)
    return nil
}

Integration with Other Features

Dependency Graph

Chat REPL uses graph to find relevant files:

> fix bug in authentication
[Graph selects: auth.go, middleware.go, session.go]
[Includes these in context automatically]

Model Selection

Each message uses optimal model:

> explain this code
[Uses query agent: gemini-2.0-flash-exp:free]

> refactor it
[Uses editor agent: llama-3.3-70b-versatile]

ML Intent Classification

Routes messages to right agent:

> how does caching work?
[ML: query intent → research agent]

> add Redis caching
[ML: editor intent → editor agent]

Comparison: Chat REPL vs One-Shot

Aspect	Chat REPL	One-Shot (traditional)
Context	Preserved	Lost
Follow-up	Natural	Requires full context
Efficiency	High (no repetition)	Low (repeat context)
Token usage	Optimized (shared context)	Wasteful (duplicate context)
UX	Conversational	Transactional
Best for	Exploration, debugging	Simple queries

Configuration

Adjust Context Limits

# Max tokens (default: 8000)
gptcode config set defaults.chat_max_tokens 16000

# Max messages (default: 50)
gptcode config set defaults.chat_max_messages 100

Higher limits = more context, but slower and more expensive Lower limits = less context, faster, cheaper

Change Model

Uses default profile models, but can override:

gptcode chat --model gpt-4

Best Practices

1. Use Descriptive First Message

# Good
gptcode chat "I'm working on the auth module, explain the JWT flow"

# Less good
gptcode chat "explain"

Better context → better responses.

2. Save Important Conversations

> /save feature-x-discussion-2025-12-04.json

Reference later or share with team.

3. Clear When Switching Topics

> /clear

Prevents context pollution.

4. Check Context Periodically

> /context

If at 80%+ tokens, consider /clear or /save and start fresh.

5. Use Initial Message for CI

gptcode chat "check if PR follows coding standards" < pr_diff.txt

Keyboard Shortcuts

Based on readline library:

Ctrl+D - Exit (EOF)
Ctrl+C - Interrupt (shows hint)
Up/Down - History navigation
Ctrl+A - Start of line
Ctrl+E - End of line
Ctrl+K - Kill to end
Ctrl+U - Kill to start

History saved to ~/.gptcode_history.

Error Handling

Lost Connection

> explain database schema
Error: connection timeout

> [retry automatically]

Chat REPL stays open on errors.

Invalid Command

> /invalid
Unknown command: /invalid (type /help for available commands)

Context Overflow

[Warning: Context at 95% capacity (7600/8000 tokens)]
[Automatically dropping 3 oldest messages]

Transparent, automatic management.

Future Enhancements

1. Voice Input

$ gptcode chat --voice
> [speak] "explain authentication"

2. Multi-File Editing

> refactor auth module
[GPTCode: I'll update 4 files, continue? y/n]

3. Interactive Diff Preview

> add logging
[Shows diff, ask for confirmation]
[Apply with /accept, reject with /reject]

4. Collaborative Sessions

> /share
Session URL: chu.dev/s/abc123
[Others can join and see conversation]

5. Semantic Search in History

> /search "password validation"
[Shows relevant messages from history]

Getting Started

1. Start Chat

gptcode chat

2. Ask Questions

> explain how routing works
> where is the config loaded?
> fix the bug in handler.go

3. Use Commands

> /context
> /files
> /history

4. Save Work

> /save today-discussion.json

5. Exit

> /exit

Community Examples

Example 1: Debugging Session

> why is the API returning 500?
> check the logs
> what's in error.log?
> the database connection is nil
> where is the database initialized?
> fix the initialization order

Result: 6-step debugging, all in context.

Example 2: Feature Development

> I want to add rate limiting
> show current middleware stack
> add rate limiter before auth
> configure it in security.go
> write tests
> /save rate-limiting.json

Result: Complete feature, documented conversation.

Example 3: Code Review

> /load pr-342-review.json
> continue reviewing the changes
> check error handling in handler.go
> suggest improvements

Result: Resumed previous review session.

Summary

Chat REPL delivers:

Persistent context (no repetition)
Natural follow-up (conversational)
File awareness (automatic relevance)
Token management (optimized costs)
Save/load (continuity)
REPL commands (control)

Try it today:

gptcode chat

Have questions about Chat REPL? Join our GitHub Discussions