Autocomplete is a system that uses a combination of retrieval methods and response processing techniques. It can be understood in roughly three parts.
In order to display suggestions quickly, without sending too many requests, we do the following:
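As an illustration only, two common ways to keep suggestions fast while limiting requests are debouncing (waiting for a short pause in typing) and caching (reusing a completion already generated for the same prefix). The sketch below assumes both; `SuggestionGate`, `fetch`, and the 0.3 second delay are hypothetical names and values, not Continue's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class SuggestionGate:
    """Hypothetical helper: decides whether a model request is worth sending."""

    fetch: Callable[[str], str]      # assumed function that calls the model
    debounce_s: float = 0.3          # assumed pause required after the last keystroke
    cache: Dict[str, str] = field(default_factory=dict)
    _last_keystroke: float = float("-inf")

    def on_keystroke(self, now: float) -> None:
        # Record when the user last typed; timestamps are passed in explicitly
        # so the logic stays deterministic and easy to test.
        self._last_keystroke = now

    def maybe_complete(self, prefix: str, now: float) -> Optional[str]:
        # Caching: a prefix we have already completed never triggers a new request.
        if prefix in self.cache:
            return self.cache[prefix]
        # Debouncing: skip the request while the user is still typing.
        if now - self._last_keystroke < self.debounce_s:
            return None
        completion = self.fetch(prefix)
        self.cache[prefix] = completion
        return completion
```

In an editor integration, `maybe_complete` would typically be called from a timer or an idle callback, so that a skipped request is retried once the user pauses.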
Continue uses a number of retrieval methods to find relevant snippets from your codebase to include in the prompt.
Language models aren’t perfect, but adjusting their output can bring them much closer. We do extensive post-processing on responses before displaying a suggestion, including:
We also occasionally filter out a response entirely if it is bad, most often because of extreme repetition.
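A minimal sketch of what a repetition filter could look like follows. It assumes a simple heuristic, which is to drop a completion when a single line makes up more than half of its non-empty lines. The function names and both thresholds are hypothetical and are not taken from Continue's code.

```python
from collections import Counter
from typing import Optional


def is_extremely_repetitive(
    completion: str,
    max_repeat_ratio: float = 0.5,  # assumed threshold
    min_lines: int = 4,             # assumed: very short completions are never flagged
) -> bool:
    # Compare non-empty lines with surrounding whitespace stripped.
    lines = [ln.strip() for ln in completion.splitlines() if ln.strip()]
    if len(lines) < min_lines:
        return False
    # Share of the completion taken up by its single most frequent line.
    most_common_count = Counter(lines).most_common(1)[0][1]
    return most_common_count / len(lines) > max_repeat_ratio


def filter_completion(completion: str) -> Optional[str]:
    # Return None to signal that no suggestion should be shown.
    return None if is_extremely_repetitive(completion) else completion
```

A line-frequency check like this is cheap enough to run on every response, but it only catches exact repeated lines; a real filter may also need to handle near-duplicates or repeated phrases within a line.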
You can learn more about how it works in the Autocomplete deep dive.