DSPy Examples
Current State
Important: DSPy itself does NOT support streaming. When you call a DSPy module, it:
- Sends the complete prompt to the LLM
- Waits for the complete response
- Returns the full result
While Snakepit’s gRPC adapter supports streaming at the transport layer, this doesn’t help with DSPy because DSPy doesn’t yield partial results.
Examples
Q&A Examples
simple_qa_demo.exs
- Basic Q&A using the default EnhancedPython adapter
mix run examples/dspy/simple_qa_demo.exs
grpc_qa_demo.exs
- Q&A using the gRPC adapter (requires gRPC dependencies)
# First ensure gRPC is available:
# mix deps.get && mix deps.compile
mix run examples/dspy/grpc_qa_demo.exs
adapter_comparison.exs
- Compares EnhancedPython vs GRPCPython adapters
mix run examples/dspy/adapter_comparison.exs
Simulated Streaming
simple_streaming_demo.exs
- Demonstrates UI techniques for progressive display:
- Word-by-word display
- Chunk-based display
- Sentence-by-sentence display
This is NOT real streaming - it’s just breaking up the complete response for better UX.
Streaming Techniques
Progressive Display
words = String.split(answer, ~r/\s+/)
Enum.each(words, fn word ->
  IO.write("#{word} ")
  IO.binwrite(:stdio, "") # Force flush
  Process.sleep(50) # Simulate typing
end)
Chunked Output
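# `words` is the list produced by String.split/2 in the previous snippet
chunks = Enum.chunk_every(words, 5)
Enum.each(chunks, fn chunk ->
  # Trailing space keeps the last word of one chunk from running into the next
  IO.write(Enum.join(chunk, " ") <> " ")
  IO.binwrite(:stdio, "")
  Process.sleep(200)
end)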
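The demo also lists sentence-by-sentence display, which has no snippet of its own here. A minimal sketch of that variant follows, assuming sentences end in `.`, `!`, or `?` followed by whitespace. The module name `SentenceDisplay` and the 300 ms default delay are illustrative and are not taken from simple_streaming_demo.exs.

```elixir
defmodule SentenceDisplay do
  # Split after sentence-ending punctuation that is followed by whitespace.
  # This rule is naive: abbreviations such as "e.g. this" are split as well.
  def split_sentences(text) do
    String.split(text, ~r/(?<=[.!?])\s+/, trim: true)
  end

  # Print one sentence at a time, pausing between sentences.
  def display(text, delay_ms \\ 300) do
    text
    |> split_sentences()
    |> Enum.each(fn sentence ->
      IO.write(sentence <> " ")
      Process.sleep(delay_ms)
    end)

    IO.write("\n")
  end
end
```

Calling `SentenceDisplay.display(answer)` on a complete response prints it one sentence at a time. Like the other techniques, it only starts after the whole answer has been returned.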
Adapter Comparison
Adapter | Protocol | Streaming Support | Use Case |
---|---|---|---|
EnhancedPython | stdin/stdout | ❌ No | Default, simple, lower overhead |
GRPCPython | gRPC/HTTP2 | ✅ Yes* | High throughput, distributed systems |
*Note: Adapter supports streaming, but DSPy doesn’t use it
Future Work
True streaming with DSPy would require:
- DSPy Changes: DSPy itself would need to support yielding partial tokens
- LLM Provider Support: The underlying LLM API must support streaming
- New Bridge Protocol: A streaming-aware protocol between Elixir and Python
- New DSPex APIs: Stream-based module interfaces
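To make the last requirement more concrete, here is a purely hypothetical sketch of what consuming a stream-based interface could look like on the Elixir side. No such API exists in DSPex today. The module `HypotheticalTokenStream` is invented for illustration, and a plain list stands in for tokens that would arrive from Python over a streaming-aware bridge.

```elixir
defmodule HypotheticalTokenStream do
  # Illustrative only: wraps a token source in a lazy Stream.
  # Here the source is an in-memory list; a real implementation would
  # pull each partial token from the Python side instead.
  def from_source(tokens) when is_list(tokens) do
    Stream.resource(
      # Start: the remaining tokens are the stream's state.
      fn -> tokens end,
      # Next: emit one token at a time, halt when none are left.
      fn
        [] -> {:halt, []}
        [token | rest] -> {[token], rest}
      end,
      # Cleanup: nothing to release for an in-memory list.
      fn _remaining -> :ok end
    )
  end
end

# A consumer can display tokens as they are pulled and still
# collect the full answer at the end.
answer =
  ["True ", "streaming ", "yields ", "tokens."]
  |> HypotheticalTokenStream.from_source()
  |> Stream.each(&IO.write/1)
  |> Enum.join()
```

Because the stream is lazy, a caller that stops early (for example with `Enum.take/2`) never requests the remaining tokens, which is the behavior a real streaming interface would be expected to offer.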
Benefits of Streaming
Even simulated streaming provides:
- Better user experience with immediate feedback
- Progressive display of long responses
- Natural feel for conversational AI
- Reduced perceived latency
Implementation Notes
The key to simulating streaming in Elixir:
- Split the response into manageable chunks
- Use IO.write/1 for output without newlines
- Force flush with IO.binwrite(:stdio, "")
- Add delays with Process.sleep/1
This creates a typing effect that mimics real streaming behavior.
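The steps above can be combined into one small helper. This is a sketch under stated assumptions: `SimulatedStream`, its `:writer` and `:delay_ms` options, and the `{:chunks, size}` mode are hypothetical names and are not part of DSPex or Snakepit. The writer is injectable so the output can be captured in tests instead of printed.

```elixir
defmodule SimulatedStream do
  @moduledoc "Hypothetical helper; not part of DSPex or Snakepit."

  # Break a complete response into display units.
  def units(text, :words), do: String.split(text, ~r/\s+/, trim: true)

  def units(text, {:chunks, size}) when is_integer(size) and size > 0 do
    text
    |> units(:words)
    |> Enum.chunk_every(size)
    |> Enum.map(&Enum.join(&1, " "))
  end

  # Emit each unit through `writer`, pausing `delay_ms` after each one.
  def emit(text, mode, opts \\ []) do
    writer = Keyword.get(opts, :writer, &IO.write/1)
    delay_ms = Keyword.get(opts, :delay_ms, 50)

    text
    |> units(mode)
    |> Enum.each(fn unit ->
      writer.(unit <> " ")
      Process.sleep(delay_ms)
    end)
  end
end
```

For example, `SimulatedStream.emit(answer, :words)` gives the word-by-word effect, and `SimulatedStream.emit(answer, {:chunks, 5}, delay_ms: 200)` gives the chunked one.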