We’re on the brink of releasing a public beta of the new search functionality at KTH. This blog post gives a short explanation of the whys and hows behind the project.
Why do we need this?
Although the current search solution at KTH is good, we still felt that it was lacking in certain areas. With more and more information being migrated to KTH Social, and much of this information being visible only to certain logged-in users, we had to find a smart way to incorporate all of this data in the main website search experience.
Apart from this, we also wanted to make use of the fact that we know much more about our own sites than Google does. We know which pages contain an interesting new research paper, which organizational unit a certain member of staff works at, and what date and time an upcoming event is scheduled to take place. Instead of searching for people, places, courses and events and then navigating through different pages to find information about these entities, our users should be able to find what they’re looking for directly in the search results.
Opening up data from KTH Social and the different entities in our systems is one important aspect of the search solution. Another is making all of this available to other services. We wanted to make sure that everything users can consume through our search box, they can also consume through a rich API. This way clever students, thrifty businesses or other educational institutions can create mobile apps, web pages and a whole lot more, incorporating all of the work we put into making our search solution as good as it is.
Oh, and also: helping you find what you’re looking for, quickly and easily.
How did we do it?
The first step toward making our search solution become what it is today was organizing and cleaning up all the stuff we wanted to be searchable. By creating rules for indexing on our 600+ domains here at KTH, collecting and sorting data that was spread out over different servers and systems, analyzing search data gathered from our old search solution and evaluating what kind of things people often searched for – we laid the foundation for the rest of the project.
Next came the identification of what we call entities. An entity can be a news article, a calendar event, a course, a programme, a person or a whole lot of other things. These are things we know have properties that set them apart from a regular web page. We knew that we would know more about these properties than regular “big” Google would, and therefore tried to incorporate as much of this knowledge as possible when creating the search experience.
By adding metadata tagging to our entities, we were able to fetch all of the properties we wanted and present them exactly the way we wanted. The extraction of meta tags is performed by our middle layer, the API. It sits between our actual hardware (a Google Search Appliance, or GSA) and the search engine result page (SERP), acting as a translator between what our GSA has indexed and what we want to present. The API translates the XML returned from the GSA into our own XML format, which is heavily based on schema.org. This free format is then consumable by whoever wants it, through our publicly accessible API.
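To make the translation step a little more concrete, here is a minimal sketch in Python. It is not our actual implementation. The sample response is a simplified, made-up GSA-style document (results in `R` elements, meta tags in `MT` elements with `N` and `V` attributes), and the `entity.` meta tag prefix, the `translate` function and the `results`/`item` output elements are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical prefix marking the meta tags that describe an entity.
PREFIX = "entity."

# Simplified, made-up sample of a GSA-style XML response.
GSA_SAMPLE = """\
<GSP>
  <RES>
    <R N="1">
      <U>https://www.example.org/events/open-lecture</U>
      <T>Open lecture on search</T>
      <MT N="entity.type" V="Event"/>
      <MT N="entity.startDate" V="2013-05-02T13:15"/>
      <MT N="entity.location" V="Room E1"/>
    </R>
    <R N="2">
      <U>https://www.example.org/about</U>
      <T>About us</T>
    </R>
  </RES>
</GSP>
"""


def translate(gsa_xml: str) -> ET.Element:
    """Turn GSA-style results into a schema.org-flavoured XML tree."""
    source = ET.fromstring(gsa_xml)
    results = ET.Element("results")
    for hit in source.iter("R"):
        # Read every meta tag on the hit as a name/value pair.
        meta = {mt.get("N", ""): mt.get("V", "") for mt in hit.findall("MT")}
        # Keep only the entity properties and strip the prefix.
        props = {k[len(PREFIX):]: v for k, v in meta.items() if k.startswith(PREFIX)}
        # Hits without a type are treated as regular web pages.
        item_type = props.pop("type", "WebPage")
        item = ET.SubElement(
            results, "item", {"itemtype": "http://schema.org/" + item_type}
        )
        ET.SubElement(item, "url").text = hit.findtext("U", default="")
        ET.SubElement(item, "name").text = hit.findtext("T", default="")
        for name, value in props.items():
            ET.SubElement(item, name).text = value
    return results


if __name__ == "__main__":
    print(ET.tostring(translate(GSA_SAMPLE), encoding="unicode"))
```

In this sketch, a hit without entity meta tags simply falls back to the schema.org type `WebPage`, so regular pages and entities can travel through the same format.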
Our top layer just asks the API for a search result, and then presents it in a way that we think looks good, and that adheres to the graphical profile of KTH.
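As a rough illustration of how thin such a top layer can be, the Python sketch below builds a request URL and turns an API response into a plain-text listing. Everything specific in it is an assumption: the base URL `https://api.example.org/search`, the `q` and `page` parameters, and the `results`/`item` response format are placeholders, not the real API.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint; the real API address is not shown here.
API_BASE = "https://api.example.org/search"


def build_search_url(query: str, page: int = 1) -> str:
    """Build the URL the top layer would request from the API."""
    return API_BASE + "?" + urlencode({"q": query, "page": page})


def render(api_xml: str) -> str:
    """Render an (assumed) API response as a plain-text result list."""
    root = ET.fromstring(api_xml)
    lines = []
    for item in root.findall("item"):
        # Use the last part of the schema.org type URL as a short label.
        kind = item.get("itemtype", "").rsplit("/", 1)[-1] or "WebPage"
        name = item.findtext("name", default="(untitled)")
        url = item.findtext("url", default="")
        lines.append(f"[{kind}] {name}")
        lines.append(f"    {url}")
    return "\n".join(lines)


# Made-up response in the assumed format.
SAMPLE_RESPONSE = (
    '<results><item itemtype="http://schema.org/Event">'
    "<name>Open lecture</name>"
    "<url>https://www.example.org/e</url>"
    "</item></results>"
)

if __name__ == "__main__":
    print(build_search_url("open lecture"))
    print(render(SAMPLE_RESPONSE))
```

Because the presentation code only depends on the API's output, anyone else consuming the same API could swap in their own rendering without touching the search backend.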
What does the future hold?
We started this project with the ambition of creating the best educational search experience to date. With this first beta release, we think we’ve taken a big step towards achieving this goal. However, we’re not done yet.
There are more entities to discover, smarter ways to present results and more data to be fetched, sorted and stored. Work is currently under way to greatly improve the personal profiles on the KTH web. We’ll make sure to follow the progress of projects like these closely, and make them an integral part of the search experience.
If you feel like something is missing in the current search solution, or that something is a little bit off – don’t hesitate to tell us! Just head over to the page Ideas and suggestions we are considering and pour your heart out.
You can try out the new search here.