New search solution at KTH

Photo by John Towner

Prior to 2010, search on the KTH website (www.kth.se/search) was based on Google’s so-called “Custom Search”, a free, partially advertising-funded version that could deliver results for public content but had no real integration capabilities. That is why we started working on a new search solution in 2009.

History and background

The new solution came with many new requirements, one of which was the ability to search protected content as a logged-in user. We also wanted the search service to be integrated into the design and to be a natural part of the website. Other requested functionality was the ability to prioritise content such as systems, courses and persons, as well as the ability to analyse usage in order to work on continuous improvement.

In 2010, the search service was launched with Google Search Appliance (GSA) as the underlying search engine. It has delivered search results to visitors of KTH ever since, but in 2017 Google announced that GSA would be discontinued as a product, so we had to go back to the drawing board.

A new solution – what have we done so far?

We started before the summer of 2017 by probing the market to find out which search service options were available. These came in different flavours and met different needs, but they can be roughly divided into the following categories:

  • “Build everything yourself” from components (Solr or Elastic based).
  • “Buy a cloud-based version” that only indexes your public content.
  • “Buy a GSA copy” that comes with hardware and software, and then build a custom search interface on top of it.
  • “Buy a software license”, install the product and build your own search interface.

We chose the last of these options and have, since the autumn of 2017, been developing a new search service based on a product called ayfie Locator.

The start page of the search service at KTH Royal Institute of Technology.

The installation is based on two components: ayfie Locator, with Solr as the search engine (including the crawler), and ayfie Predictor, which delivers autocomplete suggestions as the user starts typing a search phrase in the search field.

Integrations to push content

How data is added to the search index.

To get better control over what is indexed and when, we have built an integration layer based on a REST API that can retrieve content and deliver it to the search engine in a structured way via a database connector. With this in place, we have integrated a number of applications that now push information whenever content is created, updated or deleted. So far, we push information such as:

  • Programmes
  • Courses
  • People
  • Facilities
  • Systems
  • Groups

Once the information has been added to the index, it can be found through our search interface at www.kth.se/search.
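To illustrate the push principle, here is a minimal sketch in Python. It is not the actual integration layer: the event shape, the field names and the in-memory `SearchIndex` class are all assumptions made for the example, and the real service talks to the search engine through a REST API and a database connector rather than a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class SearchIndex:
    """Hypothetical stand-in for the search index, keyed by document id."""
    documents: dict = field(default_factory=dict)

    def apply(self, event: dict) -> None:
        """Apply one push event from a source system to the index."""
        action = event["action"]
        doc_id = event["id"]
        if action in ("created", "updated"):
            # Create and update are both upserts: the latest payload wins.
            self.documents[doc_id] = {"type": event["type"], **event["fields"]}
        elif action == "deleted":
            # Deleting an unknown id is a no-op, so a repeated push is harmless.
            self.documents.pop(doc_id, None)
        else:
            raise ValueError(f"unknown action: {action}")


# Made-up events, as a source system might push them over time.
index = SearchIndex()
index.apply({"action": "created", "id": "course-1", "type": "course",
             "fields": {"title": "Example course"}})
index.apply({"action": "updated", "id": "course-1", "type": "course",
             "fields": {"title": "Example course (revised)"}})
index.apply({"action": "deleted", "id": "course-1", "type": "course"})
```

Treating create and update as the same upsert keeps the source systems simple: they only need to say what the content looks like now, not what changed.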

Below is a schematic outline of how a search query enters the system and how the response is then returned to the user. This also applies to the search suggestions that are generated when the user starts typing in the search field.

How a search query travels through the system and how the response is delivered.

Technology consolidation

To make system management more efficient, we chose to consolidate the development stack by switching from Java to JavaScript in Node.js, with Inferno as the frontend framework. This follows a strategic decision made by the IT department a few years back. The old search service was one of the last Java applications left in the web team.

New interface with new functionality

When we started developing the new search service, there were three main things we wanted to improve. The first was the Swedish language support, and the other two were:

Autocomplete suggestions when searching

Automatic suggestions from the search index as the search phrase is entered, known as autocomplete.

Because the KTH website is quite large and a search query can touch many different areas, we try to guide the user by providing both text suggestions, which help avoid misspellings, and suggestions for specific entities such as people, programmes and courses.

This means that a query matching both a study programme and a number of courses can lead the user to the right place at an early stage: the user selects a suggestion before the search query has even been submitted.
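The idea of returning text suggestions and entity suggestions side by side can be sketched as follows. This is an illustration in Python only, not how ayfie Predictor works internally: the `suggest` function, the simple prefix matching and the sample data are all assumptions.

```python
def suggest(prefix, phrases, entities, limit=5):
    """Return text and entity suggestions for what has been typed so far."""
    needle = prefix.strip().lower()
    if not needle:
        return {"text": [], "entities": []}
    # Text suggestions: known phrases that start with the typed prefix.
    text = [p for p in phrases if p.lower().startswith(needle)][:limit]
    # Entity suggestions: entities where any word of the name starts with it.
    hits = [
        e for e in entities
        if any(word.startswith(needle) for word in e["name"].lower().split())
    ][:limit]
    return {"text": text, "entities": hits}


# Made-up sample data for the example.
phrases = ["computer science", "computational physics", "campus map"]
entities = [
    {"type": "programme", "name": "Computer Science and Engineering"},
    {"type": "course", "name": "Introduction to Computing"},
    {"type": "person", "name": "Example Person"},
]

result = suggest("comp", phrases, entities)
# result["text"] holds the two phrases starting with "comp";
# result["entities"] holds the programme and the course, but not the person.
```

With these four typed characters the user already sees a programme and a course to pick from, which is the early shortcut described above.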

Filter options

A filter gives the visitor a quick and simple way to remove a large number of irrelevant search results. It assumes that the visitor knows what he or she is looking for, and in that case it greatly improves the chance of finding what is relevant.

There are a couple of filters you can use, and the image to the right shows some of the entity filters and file type filters.
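As a rough illustration of how entity filters and file type filters narrow a result list, consider the Python sketch below. The `apply_filters` function, the field names `entity` and `file_type`, and the sample results are assumptions for the example; in the real service the filtering is handled by the search engine.

```python
def apply_filters(results, entity_types=None, file_types=None):
    """Keep only results that match every filter the visitor has selected."""
    def keep(result):
        # An empty or missing filter means "no restriction" on that field.
        if entity_types and result.get("entity") not in entity_types:
            return False
        if file_types and result.get("file_type") not in file_types:
            return False
        return True

    return [r for r in results if keep(r)]


# Made-up search results for the example.
results = [
    {"title": "Example course", "entity": "course", "file_type": "html"},
    {"title": "Course syllabus", "entity": "course", "file_type": "pdf"},
    {"title": "Example Person", "entity": "person", "file_type": "html"},
]

courses = apply_filters(results, entity_types={"course"})
course_pdfs = apply_filters(results, entity_types={"course"}, file_types={"pdf"})
```

Selecting the course filter drops the person from the list, and adding the PDF filter leaves only the syllabus.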

Enhancement: When time permits, we will probably add a date filter as well. Date filters depend, however, on the source systems that deliver content to the search service being able to provide information about when content was created or updated.

Other improvements that have been made are:

  • Improved selection of information in the search service (less noise)
  • More protected information included and searchable
  • Better selection of courses that are searchable
  • More information is pushed which gives better control of what is searchable and a faster way of getting the content searchable
  • Improved ways of searching for persons (for example by name, email, phone number, phone extension, KTH-ID, ORCID iD or username)
  • More entities such as programs, groups, and campaigns
  • Improved analysis that enables continuous improvement in search quality
  • Feedback with metadata
  • Ability to update the application without downtime for users
  • etc.

Regards,
Niklas

My name is Niklas and I work as an IT Solution Manager in the team responsible for our main CMS, blog platforms, search service and more. I use this blog to share information about new projects and systems that we develop for our users at the university.
