Sometimes we land on an unfamiliar GitHub repository and the first problem is not writing code. The real problem is understanding the project fast enough. Is this a REST API? Where are the entrypoints? How is the application wired? Are there obvious risks in the codebase? If the repository is big enough, answering those questions manually is slow and boring.
That is what this PoC is for: an interactive command-line application that can inspect any public GitHub repository and answer questions about it.

I have the feeling this workflow should exist natively on GitHub. Once repositories become large enough, being able to ask architecture, audit, or API questions feels like a natural evolution of code search and Copilot. Maybe the reason it does not exist yet is cost, scope, or product complexity. In the meantime, a CLI-first open source approach feels like a good place to start: simple, scriptable, hackable, and based on bring-your-own-model credentials so each user keeps control of their own usage and billing.
The idea is simple. We give a GitHub repository to a CLI application. The CLI creates a local checkout, exposes a small set of repository-aware tools to a Strands Agent, and lets the agent inspect the project with AWS Bedrock. Because the agent can list directories, search code and read files, we can ask practical questions such as:
- Explain how the project works
- Audit the codebase looking for risks
- List the API endpoints
- Describe the execution flow of a specific module
This is not a vector database project and it is not a RAG pipeline. It is a much simpler approach. We let the agent explore the repository directly, file by file, using tools.
The architecture
The flow is straightforward:
- The user calls the CLI with a GitHub repository.
- The repository is cloned into a local cache.
- A Strands Agent is created with a Bedrock model.
- The agent receives a system prompt plus four tools: get_directory_tree, list_directory, search_code and read_file.
- The agent inspects the repository and returns the final answer in Markdown.
This is enough for a surprising number of use cases. If the system prompt is focused on architecture, the answer becomes an explanation. If the prompt is focused on risk, the answer becomes a code audit. If the prompt is focused on HTTP routes, the answer becomes an API inventory.
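One way to picture this is a small table of task prompts that all share the same system prompt. The sketch below is only an illustration: SYSTEM_PROMPT, TASK_PROMPTS, build_prompt and the prompt wording are hypothetical names and text, not the actual contents of prompts.py.

```python
# Hypothetical prompt table: names and wording are illustrative assumptions,
# not copied from the project.
SYSTEM_PROMPT = (
    "You are a code analyst. Use the available tools to inspect the "
    "repository and answer in Markdown."
)

TASK_PROMPTS = {
    "explain": "Explain how the project works, focusing on its architecture.",
    "audit": "Audit the codebase and list concrete risks with file references.",
    "endpoints": "List every HTTP endpoint with its method, path and handler.",
}


def build_prompt(task: str, question: str = "") -> str:
    """Combine a task-specific instruction with an optional user question."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"unknown task: {task}")
    parts = [TASK_PROMPTS[task]]
    if question:
        parts.append(f"User question: {question}")
    return "\n\n".join(parts)
```

With this shape, the tools and the agent stay the same for every command. Only the task text changes.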
Project structure
I like to keep configuration in settings.py. It is a pattern I borrowed years ago from Django and I still use it in small prototypes because it keeps things simple:
src/
└── github_kb/
    ├── cli.py
    ├── settings.py
    ├── commands/
    │   ├── ask.py
    │   ├── audit.py
    │   ├── chat.py
    │   ├── endpoints.py
    │   └── explain.py
    ├── lib/
    │   ├── agent.py
    │   ├── github.py
    │   ├── models.py
    │   ├── prompts.py
    │   ├── repository.py
    │   └── ui.py
    └── env/
        └── local/
            └── .env.example
The responsibilities are small and explicit:
- github_kb/commands/ contains the Click commands.
- github_kb/lib/github.py resolves the GitHub repository and manages the local checkout.
- github_kb/lib/repository.py contains the repository exploration logic used by the agent tools.
- github_kb/lib/agent.py wires Strands Agents with AWS Bedrock.
- github_kb/lib/prompts.py keeps the system prompt and the task-specific prompts in one place.
Why this works
Large repositories are difficult because we rarely need the whole repository at once. We normally need a guided exploration strategy. A tree view helps us identify the shape of the project. Search helps us jump to the interesting files. Reading files gives us the final confirmation.
That sequence maps very well to tool-based agents.
Instead of trying to send the whole repository in one prompt, the model can progressively inspect only the relevant parts. It is cheaper, easier to reason about, and much closer to how we inspect an unknown codebase ourselves.
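To make the tree, search and read sequence concrete, here is a minimal standard-library sketch of three of the tools. The function names match the tools listed earlier, but the signatures, the SKIP_DIRS set and the limits are my assumptions for illustration. The real versions in repository.py may look different, and in the project they are exposed to the agent as Strands tools rather than called directly.

```python
from pathlib import Path

# Assumption: directories that only add noise to an exploration.
SKIP_DIRS = {".git", "node_modules", "__pycache__"}


def get_directory_tree(root: Path, max_depth: int = 2) -> list[str]:
    """Return relative paths up to max_depth; directories end with a slash."""
    lines: list[str] = []

    def walk(folder: Path, depth: int) -> None:
        if depth > max_depth:
            return
        for entry in sorted(folder.iterdir()):
            if entry.name in SKIP_DIRS:
                continue
            rel = entry.relative_to(root).as_posix()
            lines.append(rel + "/" if entry.is_dir() else rel)
            if entry.is_dir():
                walk(entry, depth + 1)

    walk(root, 1)
    return lines


def search_code(root: Path, needle: str, limit: int = 20) -> list[str]:
    """Return 'path:line: text' matches for a plain substring search."""
    hits: list[str] = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if not path.is_file() or SKIP_DIRS & set(rel.parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append(f"{rel.as_posix()}:{number}: {line.strip()}")
                if len(hits) >= limit:
                    return hits
    return hits


def read_file(root: Path, relative_path: str, max_chars: int = 20_000) -> str:
    """Read a file inside the checkout, refusing paths that escape it."""
    target = (root / relative_path).resolve()
    if not target.is_relative_to(root.resolve()):
        raise ValueError("path escapes the repository root")
    return target.read_text(encoding="utf-8")[:max_chars]
```

Capping the search hits and the number of characters returned keeps each tool result small, which is what lets the model look at only the relevant parts of a large repository.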
Install
The intended installation flow is:
pipx install github-kb
Quick start
The happy path should look like this:
aws sso login --profile sandbox
AWS_PROFILE=sandbox AWS_REGION=us-west-2 github-kb doctor
AWS_PROFILE=sandbox AWS_REGION=us-west-2 github-kb chat gonzalo123/autofix
The CLI is designed to work out of the box with the standard AWS credential chain. That means it can use:
- AWS_PROFILE
- AWS_REGION
- aws sso login
- regular access keys if they are already configured in the environment
By default, github-kb uses global.anthropic.claude-sonnet-4-6 unless BEDROCK_MODEL_ID or --model says otherwise.
You can also override the runtime explicitly with CLI flags such as --aws-profile, --region, and --model.
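The model selection can be sketched as a small fallback chain. The default model id and the BEDROCK_MODEL_ID variable come from the text above. The resolve_model_id helper is hypothetical, and the precedence (the --model flag wins over the environment variable) is my assumption rather than something the project documents.

```python
import os

# Default model id as stated in the text.
DEFAULT_MODEL_ID = "global.anthropic.claude-sonnet-4-6"


def resolve_model_id(cli_model=None, env=None):
    """Pick the model id.

    Assumed precedence: --model flag, then BEDROCK_MODEL_ID, then the default.
    """
    env = os.environ if env is None else env
    return cli_model or env.get("BEDROCK_MODEL_ID") or DEFAULT_MODEL_ID
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.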
Usage
Now we can ask questions:
github-kb ask gonzalo123/autofix "How does the automated fix flow work?"
github-kb chat gonzalo123/autofix
github-kb explain gonzalo123/autofix --topic architecture
github-kb audit gonzalo123/autofix --focus github
github-kb endpoints gonzalo123/autofix
github-kb doctor
If we want to keep the same conversation alive across multiple questions in one terminal session:
github-kb chat gonzalo123/autofix
It also accepts full GitHub URLs:
github-kb ask https://github.com/gonzalo123/autofix "Where is the application bootstrapped?"
If we want to refresh the local cache:
github-kb audit gonzalo123/autofix --refresh
We can also pass the AWS runtime explicitly:
github-kb chat gonzalo123/autofix --aws-profile sandbox --region eu-central-1
github-kb ask gonzalo123/autofix "Explain the architecture" --model global.anthropic.claude-sonnet-4-6
Demo screenshots
Here are a few real screenshots generated against one of my own repositories, gonzalo123/autofix.
The screenshots show the output of three commands: explain, endpoints and audit.
A couple of notes
This is still a PoC. The goal is not to build a perfect repository analysis platform. The goal is to validate a simple idea: an agent with a tiny set of well-chosen tools can already be useful for code understanding.
There are several obvious next steps:
- add more repository-aware tools
- persist analysis sessions
- summarize previous findings before starting a new question
- support GitHub authentication for private repositories
- add specialized prompts for security reviews or framework-specific inspections
Even in its current state, it is already a nice example of how tool-based agents can help with a very real developer problem.
The full code is available on my GitHub account.