The Web Performance Summer Hackathon 2015

Back in January, we reported on the web performance hackathon we held late last year. It was a chance for our developers (and others) to unleash their creativity outside their normal working environment. By the end of it, we had not just a number of great concepts, but also some full, working solutions, including a way to get Performance Analyser working with continuous integration solutions such as Jenkins and TeamCity.

It was so successful that we decided to run another hackathon this summer. It was held on 22–23 July and this time, the theme was data insights.

Here’s a brief outline of some of the best ideas.

On Call and Monitoring

The first team was in fact a team of just one. Jon, one of our senior systems administrators, decided to look at alert statistics from our own internal monitoring service, which is used by our engineers to check our service delivery platforms. He examined when, why and how different alerts were sent to our on-call engineers and whether this could be improved on. For example, it’s not always a good idea to send (or receive) a text message in the early hours of the morning if the issue is relatively minor – or even if it’s important but not urgent.

Jon analysed the number of messages sent per incident and the likelihood of an alert being actionable. He then tried to enhance the solutions that monitor and alert on service availability rather than relying on the more traditional status details of individual services or hosts. The end result would be a significant efficiency saving for NCC Group (because of fewer unnecessary messages) and less inconvenience for on-call engineers (for the same reason!). Jon really impressed the panel of judges and won third prize.

Proactive Insight into Big Data

The next project was also from a single-member team, which makes its ambitious scope all the more impressive. Yuksel, one of our lead developers, applied machine learning and statistical analysis to try to emulate (and even improve on) the human element in interpreting monitoring data.

Currently, some human input is always required to filter out false positives, identify trends and make predictions. Under Yuksel’s proposed solution, trends that could easily be missed by an individual would be highlighted automatically. It would look for abnormal activity, based not on whether it amounted to an error but on how much impact it had on the system. Alerts wouldn’t necessarily be sent for every slowdown – only for those that were unexpected or hinted at a more serious problem.

In the screen grab below, each bubble represents a test analysed against normal behaviour and trends. Different colours represent different levels of severity, while the radius shows the impact on performance.

For example, the green bubble is much bigger than the light blue bubble because, although load time in both tests was about the same, the green bubble’s test happened in the middle of the night, when load times are normally much faster. Similarly, the red bubble doesn’t represent a page in error, but a page that loaded in 21 seconds, compared to a typical time of around 5 seconds.
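To make the idea concrete, here is a toy sketch of how a test might be scored against a time-of-day baseline. Everything below – the baseline numbers, the ratio thresholds and the colour mapping – is invented for illustration, not taken from Yuksel’s actual implementation:

```python
from statistics import mean

# Hypothetical baseline load times (seconds) keyed by hour of day;
# these numbers and thresholds are illustrative only.
baseline = {
    3:  [1.8, 2.0, 1.9, 2.1],   # middle of the night: pages normally load fast
    14: [4.8, 5.2, 5.0, 5.1],   # afternoon: typical ~5 s load times
}

def severity(hour, load_time):
    """Colour a test by how far it departs from the norm for that hour."""
    ratio = load_time / mean(baseline[hour])
    if ratio >= 4:
        return "red"         # e.g. a 21 s load against a ~5 s norm
    if ratio >= 2:
        return "green"       # unremarkable in absolute terms, abnormal for the hour
    return "light blue"      # within expected variation

print(severity(14, 21.0))  # red: far slower than a typical afternoon
print(severity(3, 5.0))    # green: a "normal" load time, but abnormal at 3 a.m.
print(severity(14, 5.0))   # light blue: nothing unusual
```

The key point is that the same absolute load time can score very differently depending on when it was measured – exactly the distinction the bubbles above are drawing.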

[Image: outlier-graph3]

RUM Twits

This six-strong team plotted the number of Tweets mentioning certain hashtags against page impressions over time, using data from Twitter and the alpha version of NCC Group’s RUM (real user monitoring) service. In this way they would be able to correlate spikes in web traffic with spikes in social media activity.

They finished with a working demo that showed how Tweets mentioning the hashtag #webperf correlated with web traffic to this site.

Team “World Domination”

This team was made up entirely of professional services consultants and again looked at RUM data, building a dashboard that showed multiple dimensions in a single view.

RUM data was represented with circles of different sizes and colours, depending on user engagement and page performance (green circles for a fast page, progressing to red for slow). These circles were then plotted on a world map, using the Google Maps API, with RUM data aggregated by region.
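The aggregation behind such a map can be sketched in a few lines. Here RUM samples are assumed to arrive as (region, load time) pairs; the field names, thresholds and colour bands are invented for illustration:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical RUM samples: (region, load time in seconds).
samples = [("EU", 1.2), ("EU", 1.6), ("US", 4.8), ("US", 5.4), ("APAC", 2.9)]

# Group load times by region.
by_region = defaultdict(list)
for region, load_time in samples:
    by_region[region].append(load_time)

def colour(avg):
    """Map an average load time to a traffic-light colour (thresholds invented)."""
    return "green" if avg < 2 else "amber" if avg < 4 else "red"

# One circle per region: colour from average speed, radius from sample count.
for region, times in by_region.items():
    avg = mean(times)
    print(region, f"{avg:.1f}s", colour(avg), f"radius={len(times)}")
```

Each aggregated region then becomes one coloured circle plotted at its map coordinates.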

[Image: worlddomination3]

RUM-dingers

This was the largest team, with seven members, so we expected great things. We weren’t disappointed. As you can probably guess from the name, this team looked at how to get useful insights from RUM data.

The aim was to surface problems from segmented slices and visualise the data in a way that made it easy to determine the scale of any problem. There would be two reports: one showing the performance of each slice over time, and a snapshot view showing the breadth of the issue.

The result was something that alerted users to problems that wouldn’t otherwise be seen in synthetic data, or surfaced in RUM data only after time-consuming, reactive trial-and-error segmentation and manual analysis.
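To illustrate why segmentation matters, here is a toy sketch of flagging a slice whose performance deviates from the overall picture. The segment labels, sample values and the “twice the overall median” threshold are all invented, not drawn from the team’s solution:

```python
from statistics import median

# Hypothetical RUM load times (seconds) segmented by browser.
samples = {
    "Chrome":  [1.1, 1.3, 1.2, 1.4],
    "Firefox": [1.2, 1.5, 1.3, 1.2],
    "IE8":     [6.0, 7.5, 6.8, 7.1],   # one slice hides a real problem
}

# Overall median across all segments looks healthy...
overall = median(t for times in samples.values() for t in times)

# ...but segmenting reveals the slice that drags real users down.
for segment, times in samples.items():
    if median(times) > 2 * overall:
        print(f"flag {segment}: median {median(times):.1f}s "
              f"vs overall {overall:.1f}s")
```

Aggregate figures alone would smooth the slow slice away; slicing first is what makes the problem visible.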

Team RUM-dingers won second prize.

[Image: anomaly_only_errors]

Third Party Insight

This small team was made up of professional services consultant Tim (the inventor of the Web Page Toaster), with expert support from database administrator Igor.

The focus was on something that’s become a recurring theme at NCC Group Web Performance – the impact of third-party components.

The team looked at how to get more out of the vast amount of monitoring data we collect and present it in a useful way. They queried the database to find out:

  • how many organisations use a given third party
  • how many pages refer to that third party
  • how many calls were slow (over a given threshold), using data start time
  • how many calls were successful and how many were errors.
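The queries above amount to aggregating call records by third-party host. A hedged sketch of that aggregation, assuming each monitored call is a record like the ones below (the field names, hostnames and 2-second threshold are invented, not the team’s actual schema):

```python
# Hypothetical monitoring records; real data would come from the database.
calls = [
    {"org": "acme",  "page": "/home", "host": "cdn.example.net", "ms": 180,  "ok": True},
    {"org": "acme",  "page": "/cart", "host": "cdn.example.net", "ms": 2400, "ok": True},
    {"org": "bravo", "page": "/news", "host": "cdn.example.net", "ms": 90,   "ok": False},
]

third_party = "cdn.example.net"
hits = [c for c in calls if c["host"] == third_party]

print("organisations:", len({c["org"] for c in hits}))   # who uses this third party
print("pages:", len({c["page"] for c in hits}))          # where it is referenced
print("slow (>2s):", sum(c["ms"] > 2000 for c in hits))  # calls over the threshold
print("errors:", sum(not c["ok"] for c in hits))         # failed calls
```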

The end result was an accessible, aggregated view of third-party performance, which could prove incredibly useful to customers using those services, as well as to the third parties themselves.

For this reason, the judges awarded this team first prize – congratulations to Tim and Igor!

[Image: tpinsight]

Once again, the hackathon proved to be an excellent way to get our talented developers and consultants to think creatively, away from the constraints of the day-to-day routine. And once again, they came up with a range of concepts and prototypes that are likely to find their way into our finished products.

Well done to all involved.


Andrew Darnell

Andrew is Engineering Manager at NCC Group Web Performance.
