Correlated Content

Replace Text in a Stream: Introduction

What is the best way to replace a string in a stream?

I was working with request and response body transforms in YARP when I read the following in the documentation:

The below example uses simple, inefficient buffering to transform requests. A more efficient implementation would wrap and replace HttpContext.Request.Body with a stream that performed the needed modifications as data was proxied from client to server. That would also require removing the Content-Length header since the final length would not be known in advance. (source)

This tickled my curiosity and I went on a quest for that more efficient implementation. During that quest I learned about Span<T> and Pipes and the surprising power of regex. I got some help from friends and used new tools that helped me to find the most efficient way of replacing a string in a stream.

Use Case

My use case is replacing URLs in requests to and responses from a backend service. Traditionally this would be handled by adding an X-Forwarded-Host header to the request. Alas, the backend service does not implement this feature and we do not control the backlog. So we’re stuck with replacing the URLs ourselves.

This use case gives me some boundaries:

  • The requests and responses can be large, but not huge
  • The format is most likely JSON, so we cannot use newline as delimiter
  • The string to replace will be limited in size
  • Matching must be case-insensitive (this is a biggie, as you’ll see)
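
To make the case-insensitivity requirement concrete, here is a minimal sketch on a plain string rather than a stream. The host names and the CaseDemo class are made up for illustration; the only library call involved is string.Replace with a StringComparison argument (available since .NET Core 2.0).

```csharp
using System;

public static class CaseDemo
{
    // Hypothetical JSON body; the backend writes its host name with mixed casing.
    public const string Body = "{\"self\":\"https://Backend.Internal/api/items/1\"}";

    // An ordinal (case-sensitive) replace does not match "Backend.Internal",
    // so the body comes back unchanged.
    public static string OrdinalReplace() =>
        Body.Replace("backend.internal", "proxy.example.com", StringComparison.Ordinal);

    // Ignoring case finds the host regardless of how the backend cased it.
    public static string IgnoreCaseReplace() =>
        Body.Replace("backend.internal", "proxy.example.com", StringComparison.OrdinalIgnoreCase);
}
```

Host names are case-insensitive, so a case-sensitive match can silently skip URLs that are perfectly valid.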

Methodology

I defined the following simple interface and implemented it using the different solutions:

Task Replace(Stream input, Stream output, string oldValue, string newValue, CancellationToken cancellationToken = default);

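
As a point of reference, the simplest way to satisfy this signature is to buffer everything, which is the "simple, inefficient" approach the YARP documentation describes. The sketch below is not one of the benchmarked solutions from this series. The names IStreamReplacer and BufferingReplacer are hypothetical, and UTF-8 input is an assumption.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical interface name; only the method signature comes from the post.
public interface IStreamReplacer
{
    Task Replace(Stream input, Stream output, string oldValue, string newValue, CancellationToken cancellationToken = default);
}

// Naive baseline: read the whole input into memory, replace, write it out.
public sealed class BufferingReplacer : IStreamReplacer
{
    public async Task Replace(Stream input, Stream output, string oldValue, string newValue, CancellationToken cancellationToken = default)
    {
        // Assumes UTF-8 content; leaveOpen keeps the caller's stream usable.
        using var reader = new StreamReader(input, Encoding.UTF8, true, 4096, leaveOpen: true);
        string text = await reader.ReadToEndAsync();
        cancellationToken.ThrowIfCancellationRequested();

        // Case-insensitive match, as the use case requires.
        string replaced = text.Replace(oldValue, newValue, StringComparison.OrdinalIgnoreCase);

        byte[] bytes = Encoding.UTF8.GetBytes(replaced);
        await output.WriteAsync(bytes, 0, bytes.Length, cancellationToken);
    }
}
```

This version holds the complete body in memory as a string and again as a byte array, which is exactly the cost a streaming implementation tries to avoid.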

I used BenchmarkDotNet to check how long the method takes to run and how much memory it consumes. The benchmarking was done with a 2 MB text file containing the Lorem Ipsum text, in which the word lorem is replaced by schorem.

I also created a simple website with a minimal API, with an endpoint for each solution, to check whether the methods can actually be used in real life. Of course I also wrote some unit tests. The Alba library is really handy for testing the minimal API endpoints.

In my journey I explored various corners of .NET and made some new friends along the way.

Other Posts in this series

  1. Introduction
  2. String Replace
  3. StreamReader and StreamWriter
  4. Sidestep: Regex