Developing an AI prototype
TL;DR: If you want to brush up on your programming skills, make some nice money and work in an environment with clever, friendly people: go do the Development Summer Internship!!
Artificial intelligence is the buzzword of the decade, no question about it. While it used to be very difficult to develop an AI product from scratch (unless you were an experienced researcher), the entry barrier is now so low that even students can implement something meaningful. This was the premise of this year’s TOPdesk Development Summer Internship: get a handful of young people with an affinity for programming and let them develop a technical product from scratch. The only common denominator for this year’s group was that everyone was a highly motivated student with at least some programming experience; besides that, the group was a nice blend of different nationalities, personalities and skillsets.
The first two weeks of the internship are all about getting everyone up to speed with the latest technologies: Git, Scrum, Java, Vue and much more. During these two weeks, the interns are mentored by a senior programmer who has seen it all. A crash course on Java teaches the interns all the ins and outs of the object-oriented programming paradigm. During the second week, different people from TOPdesk came to give presentations on certain topics, kind of like a lecture. Since the group was compact (12 interns), it was very easy to ask questions and have discussions during the presentations. Seeing as I already had three years of programming experience under my belt, the coding exercises were straightforward. Nevertheless, there wasn’t a dull moment, because I learned that being a developer is so much more than programming: it also means communicating frequently with stakeholders to make sure that the work you’re doing is really what the customer wants. Clearly setting concrete goals and having short development cycles is a cornerstone of development as well.
The inner workings of TOPdesk were explained as well: TOPdesk is a service management and ticketing software package: any company with a servicedesk can use TOPdesk to register complaints and questions that their customers have. Our project would revolve around helping these servicedesk operators find their way through their company’s online knowledge base.
The Design Sprint
Knowing how to be agile and how to implement polymorphism in Java is all really interesting, but there’s still a huge chasm between having these skills and knowing what to do with them. That’s why the third week was all about designing, drawing, brainstorming and setting the scope for the project: what awesome product were we going to present after 6 weeks? The focus of this week was squeezing as many ideas as possible out of our brains, which resulted in a wall covered in ‘user stories’: small hand-drawn 4-panel comics where a user had a problem that our product could solve.
Collecting feedback from actual users was the next (and probably the most important) step: we rapidly prototyped the three best designs and presented them to support operators: the end-users. Based on their opinion, we mixed the best parts of these three ideas into an amalgam of awesomeness:
A.I.K.I. (Artificial Intelligence Knowledge Item)-search:
For Servicedesk / support operators
Who have difficulties searching for knowledge items
Is an intuitive, smart search tool
That has an AI-supported suggestion system
Unlike the current TOPdesk suggestion widget
Our product not only searches for keywords, but also takes context into account.
Let the programming begin!
The following 5 weeks revolved around getting a working proof of concept ready. Before we could make any decisions on how to approach the problem (what algorithms to use, which languages the code would be in), we had to do some research. As complete AI newbies, we didn’t know where to start. Luckily, we got some advice from an expert who works at TOPdesk in Hungary. He and other people working at TOPdesk were happy to speak to us in a video conference. We fiddled around with some machine learning libraries in Python and got a clear picture of what route to take by the end of week 3.
The workflow of the remaining weeks was in line with the Agile paradigm: make short bursts of progress (sprints), make sure what you’re doing is in line with the customer’s expectations, and present your work to people at the end of the week during the Sprint Review. Sprint reviews are a nice way of informally sharing your progress on the project with anyone who is interested, and they offer a way to practice your presentation skills as well as your ability to explain a technical subject to an audience. If you’re lucky, a knowledgeable developer might give you some valuable feedback.
This year’s summer internship was an experience I would not have wanted to miss out on. It has given me the chance to get a taste of the ‘working life’ at a company that respects the autonomy of its employees by just trusting them: you’re even encouraged to pop off from your desk once in a while to play some Mario Kart or Guitar Hero (the image below illustrates this). Throughout the eight weeks, a lot of fun events were organized: for example, we went bouldering (kind of rock climbing), did Expedition Robinson on the beach and had a Great Gatsby themed party with lots of fancy drinks and well-dressed guests.
As of writing this, the team is halfway through the final week; polishing the web interface and making small tweaks to the search engine are the focus of this sprint. The code repository has grown to a respectable size and the prototype is working: it’s spitting out pretty reasonable suggestions most of the time! In this short period of 8 weeks, it’s nice to see how the team has improved its efficiency by using tools like GitLab and Jira. We create a separate branch for every new feature, and as soon as a feature is pushed, it’s rigorously reviewed by another team member. It’s been a nice eight weeks and I don’t regret doing the internship instead of going on holiday this year; not only did I learn some valuable skills, I also earned some nice money and met some very clever people.
If you’re interested in the more technical part of what we’ve actually researched and built, please read on!
The most fitting name for our software is a recommendation system. The use case scenario is one where a supporter is on the phone with a customer, and has typed in a description of the customer’s problem into an input field. Our AIKI-engine then processes this written text, along with other information that the supporter has filled in, and finally recommends a number of knowledge items that are most likely the solution. It’s like magic, but how does it do it?
Before the AI can do anything, we have to train the model on a dataset, preferably a large one. This dataset consists of all the previous incidents that servicedesk supporters have registered in the past. For our project, we used an in-house dataset from TOPdesk that contained roughly 30 000 incidents, along with the attached knowledge item that solved each incident. Next, we trained a natural language processing model, Doc2Vec, on a clean subset of this data. Why train it only on a subset and not on the whole dataset? Because we needed to validate our model on data it had never seen before, we had to reserve a subset of the data to act as a test set.
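As a rough illustration, the split might look like the sketch below. This is plain Python with made-up field names and a made-up 80/20 ratio, not the real TOPdesk schema:

```python
import random

def split_dataset(incidents, test_fraction=0.2, seed=42):
    """Shuffle the incidents and reserve a fraction as unseen test data."""
    rng = random.Random(seed)
    shuffled = incidents[:]  # don't mutate the caller's list
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

# Toy stand-in for the ~30 000 real incidents.
incidents = [
    {"description": f"incident {i}", "knowledge_item": f"KI-{i % 5}"}
    for i in range(100)
]

train, test = split_dataset(incidents)
# Only `train` is shown to the Doc2Vec model (e.g. gensim's
# implementation); `test` stays completely unseen until evaluation.
print(len(train), len(test))  # 80 20
```

The important property is simply that nothing in the test set ever reaches the model during training.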
How it works
After the model has been trained, it can be queried by feeding it a string of text. We chop the string up into tokens and remove any noisy elements, such as common stop words (‘the’, ‘that’, ‘as’). Punctuation, dates and times are tossed out as well: what’s left is an array of strings that is fed into the model. The model does some mathemagics and returns the incidents in the database that are most similar to it. What we present to the user is then a carefully selected subset of the knowledge items linked to those incidents. The philosophy here is two-sided:
- If two incidents are similar, they probably have the same solution
- If the incident has a linked knowledge item, then that is probably the solution because a human linked that knowledge item.
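To make that pipeline concrete, here is a minimal sketch of the query side. It uses a crude token-overlap (Jaccard) similarity as a stand-in for Doc2Vec’s learned vector similarity, and the stop word list and example incidents are made up, but the shape is the same: tokenize, strip noise, rank incidents, collect their linked knowledge items:

```python
import re

# Tiny illustrative stop word list; a real one would be much larger.
STOP_WORDS = {"the", "a", "an", "is", "to", "my", "i", "and", "of", "on"}

def tokenize(text):
    """Lowercase, strip punctuation/digits/dates, drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

def similarity(tokens_a, tokens_b):
    """Jaccard overlap -- a crude stand-in for Doc2Vec's similarity score."""
    a, b = set(tokens_a), set(tokens_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def suggest(query, incidents, top_n=5):
    """Return the knowledge items linked to the incidents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(
        incidents,
        key=lambda inc: similarity(q, tokenize(inc["description"])),
        reverse=True,
    )
    seen, suggestions = set(), []
    for inc in ranked:
        ki = inc["knowledge_item"]
        if ki not in seen:  # don't suggest the same item twice
            seen.add(ki)
            suggestions.append(ki)
        if len(suggestions) == top_n:
            break
    return suggestions

incidents = [
    {"description": "Printer on the second floor is jammed", "knowledge_item": "KI-printer"},
    {"description": "Cannot connect to the office wifi", "knowledge_item": "KI-wifi"},
    {"description": "The printer shows a paper jam error", "knowledge_item": "KI-printer"},
]

print(suggest("my printer is jammed again", incidents, top_n=2))
# → ['KI-printer', 'KI-wifi']
```

Swapping `similarity` for a real Doc2Vec query is what turns this keyword-overlap toy into the context-aware search described above.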
By now it should be clear that with AIKI, we haven’t tried to build an omniscient, Skynet-like AI that is smarter than humans in every way; it just performs some quick similarity search on a huge database of previous incidents and basically crowdsources the accumulated knowledge of support operators to suggest a relevant knowledge item.
Is it any good?
To test the model we made, we showed it some data it had never seen before: the test data. Just like the training data, these were incidents that had been solved: they had a knowledge item linked to them. We showed each incident to our model and asked it to recommend 5 knowledge items. If one of those was the correct one, we counted it as a ‘hit’. With this metric, we managed to achieve an accuracy of 39%. However, this approach produces many false negatives: not every incident has one clear solution, and some incidents have multiple possible solutions.
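The hit metric itself is simple to sketch. Below, `recommend` is a placeholder for a query to the trained model, and the two test incidents are made up:

```python
def hit_at_n(test_incidents, recommend, n=5):
    """Fraction of test incidents whose linked knowledge item
    appears among the top-n recommendations (a 'hit')."""
    hits = 0
    for inc in test_incidents:
        suggestions = recommend(inc["description"], n)
        if inc["knowledge_item"] in suggestions:
            hits += 1
    return hits / len(test_incidents)

# Hypothetical stand-in recommender that always suggests the same items.
def dummy_recommend(description, n):
    return ["KI-1", "KI-2", "KI-3", "KI-4", "KI-5"][:n]

test_incidents = [
    {"description": "password reset", "knowledge_item": "KI-2"},
    {"description": "vpn down", "knowledge_item": "KI-9"},
]

print(hit_at_n(test_incidents, dummy_recommend))  # 0.5
```

Run over the real test set with the real model, this is the calculation behind the 39% figure.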
We also performed some user tests and they were promising! All users reported that at least one of the suggestions was useful in 5 out of the 6 sample incidents we gave them.
Naturally, we needed an interface with the real world, both for user tests and for manual tests. We cloned the TOPdesk first-line call form and connected a Python server to it. This server queries the model and maintains an in-memory database of the knowledge items. The front end only sends GET requests to the back end and displays the results in a nice way, along with the confidence that each suggestion is relevant.
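As a rough sketch of such a back end, using only Python’s standard library: the route, field names and knowledge items below are all assumptions, not the actual server we built.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Made-up in-memory "database" of knowledge items.
KNOWLEDGE_ITEMS = {
    "KI-printer": "How to clear a paper jam",
    "KI-wifi": "Reconnecting to the office network",
}

def query_model(text):
    """Placeholder for querying the trained model:
    returns (knowledge item id, confidence) pairs."""
    return [("KI-printer", 0.87), ("KI-wifi", 0.12)]

class SuggestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # e.g. GET /suggest?q=my+printer+is+jammed
        text = parse_qs(urlparse(self.path).query).get("q", [""])[0]
        suggestions = [
            {"id": ki, "title": KNOWLEDGE_ITEMS[ki], "confidence": conf}
            for ki, conf in query_model(text)
        ]
        body = json.dumps(suggestions).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet

# To run it: HTTPServer(("localhost", 8000), SuggestHandler).serve_forever()
```

The front end then just fetches this endpoint and renders the suggestions along with their confidence values.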