Replace Text in a Stream: Introduction
What is the best way to replace a string in a stream?
I was working with request and response body transforms in YARP when I read the following in the documentation:
The below example uses simple, inefficient buffering to transform requests. A more efficient implementation would wrap and replace HttpContext.Request.Body with a stream that performed the needed modifications as data was proxied from client to server. That would also require removing the Content-Length header since the final length would not be known in advance. (source)
This tickled my curiosity and I went on a quest for that more efficient
implementation. During that quest I learned about Span<T> and Pipes and the
surprising power of regex. I got some help from friends and used new tools
that helped me to find the most efficient way of replacing a string in a
stream.
Use Case
My use case is replacing URLs in requests to and responses from a backend
service. Traditionally this would be handled by adding an X-Forwarded-Host
header to the request. Alas, the backend service does not implement this
feature and we do not control its backlog. So we’re stuck with replacing the
URLs ourselves.
This use case gives me some boundaries:
- The requests and responses can be large, but not huge
- The format is most likely JSON, so we cannot use newline as delimiter
- The string to replace will be limited in size
- Matching must be case-insensitive (this is a biggie, as you’ll see)
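To make these boundaries a bit more concrete: without a delimiter, the input has to be read in arbitrary chunks, so a match can straddle two reads. The sketch below is only an illustration of that pitfall, not one of the solutions compared later. `BoundaryDemo`, `ReplacePerChunk` and the chunk sizes are hypothetical names and values picked for the example.

```csharp
using System;
using System.Text;

public static class BoundaryDemo
{
    // Hypothetical helper: replaces within each fixed-size chunk independently,
    // the way a naive streaming loop would. Matching ignores case.
    public static string ReplacePerChunk(string text, int chunkSize, string oldValue, string newValue)
    {
        var result = new StringBuilder();
        for (var i = 0; i < text.Length; i += chunkSize)
        {
            var chunk = text.Substring(i, Math.Min(chunkSize, text.Length - i));
            // A match that starts in this chunk and ends in the next one is not seen here.
            result.Append(chunk.Replace(oldValue, newValue, StringComparison.OrdinalIgnoreCase));
        }
        return result.ToString();
    }
}
```

With a chunk size of 5, the input "xx Lorem yy" is split into "xx Lo", "rem y" and "y", so the word is never matched. Because the string to replace is limited in size, a streaming solution only needs to carry a small tail of each chunk over to the next read to avoid this.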
Methodology
I defined the following simple interface and implemented it using the different solutions:
Task Replace(Stream input, Stream output, string oldValue, string newValue, CancellationToken cancellationToken = default);
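For reference, the simplest thing that satisfies this signature is to buffer everything, which is the kind of inefficient approach the YARP documentation warns about. The following is a minimal sketch of such a baseline, not necessarily the implementation used in the benchmarks. The class name `BufferedReplacer` is hypothetical, and it assumes the content is UTF-8 text.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public static class BufferedReplacer
{
    public static async Task Replace(Stream input, Stream output, string oldValue, string newValue, CancellationToken cancellationToken = default)
    {
        // Assumption: the body is UTF-8 text. leaveOpen keeps the caller's stream usable.
        using var reader = new StreamReader(input, Encoding.UTF8, detectEncodingFromByteOrderMarks: true, bufferSize: 4096, leaveOpen: true);

        // Reads the whole input into memory; this is what makes the approach inefficient.
        var text = await reader.ReadToEndAsync();
        cancellationToken.ThrowIfCancellationRequested();

        // Case-insensitive replacement on the fully buffered string.
        var replaced = text.Replace(oldValue, newValue, StringComparison.OrdinalIgnoreCase);

        // Write without a byte order mark so the output is just the replaced text.
        var bytes = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false).GetBytes(replaced);
        await output.WriteAsync(bytes, 0, bytes.Length, cancellationToken);
    }
}
```

Calling it with two `MemoryStream` instances, `"lorem"` as `oldValue` and `"schorem"` as `newValue` replaces every casing of the word, at the cost of holding the complete input and output in memory at once.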
I used BenchmarkDotNet to check how long the method takes to run and how much memory it consumes. The benchmarking was done with a 2 MB text file containing the Lorem Ipsum text, where the word lorem is replaced by schorem.
I also created a simple website with a minimal API, with an endpoint for each solution, to check whether the methods can actually be used in real life. Of course, I also wrote some unit tests. The Alba library is really handy for testing the minimal API endpoints.
In my journey I explored various corners of .NET and made some new friends along the way.