The Challenge
Building an AI assistant that can understand natural language commands and take action across multiple external tools is no small feat. Users don't follow scripts. They say things like:
- "create a ticket for this"
- "make a jira issue from the thread above"
- "log this as a P1 incident and assign it to Sarah"
Why Claude
We evaluated several LLM providers and ultimately chose Anthropic's Claude for a few key reasons:
Our Architecture
Quikly's AI layer works in three stages:
1. Intent Classification
When a user mentions @Quikly, we first classify what they're trying to do. We define a set of known intents:
- create_issue - Create a new Jira/Linear issue
- update_issue - Update an existing issue
- search_issues - Query for issues matching criteria
- create_doc - Create a Confluence/Notion page
- post_update - Post a message to a channel
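The classification step can be sketched as a system prompt that constrains the model to one of the known intents. This is a minimal illustration, not Quikly's actual prompt; `build_classifier_prompt` is a hypothetical helper:

```python
# Known intents and their descriptions, mirroring the list above.
INTENTS = {
    "create_issue": "Create a new Jira/Linear issue",
    "update_issue": "Update an existing issue",
    "search_issues": "Query for issues matching criteria",
    "create_doc": "Create a Confluence/Notion page",
    "post_update": "Post a message to a channel",
}

def build_classifier_prompt(intents: dict) -> str:
    """Build a system prompt that forces the model to pick one known intent."""
    lines = [f"- {name}: {desc}" for name, desc in intents.items()]
    return (
        "You are an intent classifier. Given a user message, respond with "
        "exactly one intent name from this list, or 'unknown':\n"
        + "\n".join(lines)
    )
```

The resulting string would be passed as the `system` parameter of a model call; constraining the output to a closed set of names makes the response trivially parseable.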
2. Parameter Extraction
Once we know the intent, Claude extracts the relevant parameters. For a create_issue intent, this might include:
- Summary (from the thread or explicitly stated)
- Priority (P1, P2, etc.)
- Issue type (Bug, Incident, Task)
- Assignee (if mentioned)
- Project (from channel mapping or explicit)
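These parameters map naturally onto a tool definition. Below is a minimal sketch in the JSON Schema style that tool-calling LLM APIs use; the schema, the `validate_params` helper, and the specific enums are illustrative assumptions, not Quikly's real definitions:

```python
# Hypothetical tool definition for the create_issue intent.
CREATE_ISSUE_TOOL = {
    "name": "create_issue",
    "description": "Create a new Jira/Linear issue from a thread.",
    "input_schema": {
        "type": "object",
        "properties": {
            "summary": {"type": "string", "description": "One-line issue summary"},
            "priority": {"type": "string", "enum": ["P1", "P2", "P3", "P4"]},
            "issue_type": {"type": "string", "enum": ["Bug", "Incident", "Task"]},
            "assignee": {"type": "string", "description": "Username, if mentioned"},
            "project": {"type": "string", "description": "Project key"},
        },
        "required": ["summary", "issue_type"],
    },
}

def validate_params(params: dict, tool: dict) -> list:
    """Return a list of problems with extracted parameters (empty if valid)."""
    errors = []
    schema = tool["input_schema"]
    for field in schema["required"]:
        if field not in params:
            errors.append(f"missing required field: {field}")
    for field, value in params.items():
        spec = schema["properties"].get(field)
        if spec is None:
            errors.append(f"unexpected field: {field}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"invalid value for {field}: {value}")
    return errors
```

Validating the model's extracted parameters before calling any external API is what turns "the LLM usually gets it right" into a dependable pipeline.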
3. Tool Orchestration
Finally, Claude decides which tools to call and in what order. For complex requests like "create an incident and draft a postmortem," this might involve:
- Calling create_issue to file the incident
- Calling create_doc to draft the postmortem page
- Calling post_update to notify the channel
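The orchestration step can be sketched as a sequential executor that feeds earlier results into later calls. Everything here is hypothetical: `run_plan`, the handler signatures, and the stub "OPS-42" issue key are illustrations, not Quikly's implementation:

```python
def run_plan(plan, handlers):
    """Execute the (tool_name, params) pairs the model chose, in order.

    Each handler also receives a snapshot of earlier results, so a later
    step (e.g. drafting a postmortem) can reference an earlier one
    (e.g. the incident it documents).
    """
    results = {}
    for tool_name, params in plan:
        handler = handlers[tool_name]
        results[tool_name] = handler(**params, context=dict(results))
    return results

# Stub handlers standing in for real Jira/Confluence API calls.
handlers = {
    "create_issue": lambda summary, context: {"key": "OPS-42", "summary": summary},
    "create_doc": lambda title, context: {"url": f"/wiki/{title}", "links": context},
}

plan = [
    ("create_issue", {"summary": "API outage"}),
    ("create_doc", {"title": "API outage postmortem"}),
]
results = run_plan(plan, handlers)
```

Keeping execution in application code, with the model only producing the plan, means every external side effect stays auditable and retryable.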
The Prompt Engineering
Getting consistent, reliable behavior from an LLM requires careful prompt engineering. Here's what we learned:
- Be explicit about the output format. We use structured tool definitions that clearly specify the expected parameters and their types.
- Provide examples. Few-shot examples in the system prompt dramatically improve consistency.
- Handle edge cases in the prompt. What should happen if the user doesn't specify a project? If the thread is empty? We address these explicitly.
- Test extensively. We maintain a test suite of real user messages to ensure changes don't break existing behavior.
Results
After months of iteration, Quikly now correctly handles ~95% of user requests on the first try. The remaining 5% are typically ambiguous requests where asking for clarification is the right behavior.
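A test suite like the one that produces this kind of number can be sketched as a golden set of real user messages paired with expected intents. The structure below is a hypothetical illustration; the fourth message and the `run_regression` helper are invented for the example:

```python
# Golden cases: (user message, expected intent). The first three are the
# example phrasings from the top of this post; the last is invented.
GOLDEN_CASES = [
    ("create a ticket for this", "create_issue"),
    ("make a jira issue from the thread above", "create_issue"),
    ("log this as a P1 incident and assign it to Sarah", "create_issue"),
    ("what open bugs are assigned to me?", "search_issues"),
]

def run_regression(classify, cases):
    """Run every golden case through `classify`; return the failures."""
    failures = []
    for message, expected in cases:
        actual = classify(message)
        if actual != expected:
            failures.append((message, expected, actual))
    return failures
```

Running this suite on every prompt change catches regressions before they ship, which is how a first-try success rate stays stable while the prompt keeps evolving.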
What's Next
We're continuing to improve our AI layer:
- Multi-turn conversations - Remembering context across multiple messages
- Learning from corrections - When users correct Quikly, using that to improve
- Proactive suggestions - Recognizing patterns and offering to help before being asked